More things I’d like to see done in Emacs

Wednesday, 30 November 2022

Over a year ago, I wrote a text titled “What I’d like to see done in Emacs”. There I mentioned a few ideas and projects related to Emacs that I think are worth pursuing.

Some time later, one has materialised (Compat), one has been worked on in a variation (package-vc) and one more has been worked on, but I lost the patch…

Nevertheless, I’d like to propose a few more ideas along the same lines. An early “five-year-plan” so to speak. Of course I don’t (and can’t) reserve any exclusivity on tackling these projects – anyone interested is welcome to pick these up, with or without my cooperation.

Eww Unmangler

It is no secret that the web is a mess. The last 30 years have demonstrated that HTML hasn’t been expressive enough to satisfy the needs of a world-wide-web. CSS and JavaScript didn’t suffice as a thin layer, but became the material on which the desired path was built that led us to where we are. Combine this with the economic interests of platforms centred around advertisement and the trajectory, especially the deviation from the initial intentions, doesn’t appear that surprising – in retrospect.

It comes as no surprise to many that the built-in browser EWW struggles with handling anything beyond the simplest websites. For a website to work with these kinds of text-oriented web browsers (links, w3m, …) a web developer has to consciously restrict their technologies and attempt to design a site that can be used with simpler tools. This is not a given.

To make the web work with EWW, or at least to make some segment of the web usable, it appears that manual intervention is necessary. What I have been thinking about is a tool that hooks into EWW and, depending on the site, applies transformations to make the site readable: reformatting headers, removing unnecessary elements, making the page more readable.

This requires a database of popular websites (for programmers the focus would initially lie in pages like StackOverflow, Reddit, Quora, GitHub, GitLab, …), and ideally a DOM-manipulation language to make it more maintainable. This could make use of eww-after-render-hook, but I fear that this would be too late. Instead it might be necessary to advise eww-display-html (or in the future extend EWW to make these kinds of manipulations easier).
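A minimal sketch of what such a site database could look like, assuming the simplest possible rule format: a host regexp paired with a list of tags to drop. The names my-unmangle-rules and my-unmangle-dom are hypothetical, as is the example.org entry. Only the DOM transformation is shown; how to wire it into EWW is left open.

```elisp
(require 'dom)
(require 'url-parse)

;; Hypothetical rule table: each entry is (HOST-REGEXP . TAGS-TO-REMOVE).
(defvar my-unmangle-rules
  '(("\\`\\(?:www\\.\\)?example\\.org\\'" nav footer aside)))

(defun my-unmangle-dom (url dom)
  "Destructively remove the elements configured for URL's host from DOM.
Return DOM."
  (let ((host (url-host (url-generic-parse-url url))))
    (dolist (rule my-unmangle-rules)
      ;; Only apply rules whose regexp matches the host of URL.
      (when (and host (string-match-p (car rule) host))
        (dolist (tag (cdr rule))
          (dolist (node (dom-by-tag dom tag))
            (dom-remove-node dom node)))))
    dom))
```

A real rule language would need more than tag removal (selecting by class or id, rewriting headers), but the lookup by host would probably stay much the same.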

I have been working on an initial sketch for a package like this, but am not satisfied with what I have written up until now. As mentioned above, the main problem is not technical, but lies in finding an elegant way to express the problem.

ELPA Submission Helper

Sharing and publishing packages should be easier than it is now. Of course you can just upload your code somewhere, but then any interested user would have to fetch it, install it, check for dependencies and look out for updates manually. There is a reason why package managers like package.el are popular.

Sadly it is not as convenient on the other end. Sure, contributing a package to GNU or NonGNU ELPA just requires sending an email to emacs-devel@gnu.org, but there remain a number of implicit conventions that new contributors may be confused by. In my own experience, these include but are not restricted to:

I believe it should be possible to provide a little package that checks these conventions mechanically and interactively asks the user most of these questions, resulting in a message that can simply be sent out.

This could include preparing a Git repository and suggesting a Git forge like Codeberg or Sourcehut. After explaining the difference between NonGNU and GNU ELPA, a request to sign the copyright assignment (CA) could be prepared as well, if the maintainer chooses to distribute their package as part of GNU ELPA, which is part of Emacs. It might even make sense to clone emacs/elpa.git or emacs/nongnu.git and directly prepare a patch.

If package-lint is added to NonGNU ELPA, then that could also be integrated into the process.

Another advantage of this approach is that the message could be generated with some special header information that would allow a CI-like process to detect the message and run a few automatised tests on some recent versions of Emacs, to be shared on the mailing list.

I don’t have any code for this idea yet, but preparing a preliminary version shouldn’t be that difficult, if there is interest in a little elpa-helper.
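To give a rough idea of the mechanical part, here is a minimal sketch of one check such a helper could run: reporting which library headers are missing from the current buffer. The name my-elpa-helper-missing-headers and the choice of required headers are assumptions made for illustration, not an established ELPA rule set. The header lookup itself uses lm-header from the built-in lisp-mnt library.

```elisp
(require 'lisp-mnt)

;; Assumption: which headers to insist on is a guess made for this sketch.
(defvar my-elpa-helper-required-headers '("Version" "Author" "URL"))

(defun my-elpa-helper-missing-headers ()
  "Return the required library headers missing from the current buffer."
  (save-excursion                       ; `lm-header' moves point
    (let (missing)
      (dolist (header my-elpa-helper-required-headers)
        (unless (lm-header header)
          (push header missing)))
      (nreverse missing))))
```

An interactive command could then ask for each missing header in turn and insert it, before composing the message.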

Distributed Content-Addressable Elisp Pastebin

I have a lot of small utility functions that I wrote up once and have never changed since. One of the most frequent commands I use is this:

(defun site/kill-region-or-word ()
  "Call `kill-region' if there is an active region.
Otherwise kill the last word, just like in Unix."
  (interactive)
  (call-interactively (if (region-active-p)
                          #'kill-region
                        #'backward-kill-word)))

The chance is slim that this will ever require changing. This is finished code, and will work for as long as all the functions it uses work.

If I wanted to share this snippet with someone else, I don’t think that creating a package and submitting it to an archive would be the right approach. I am not fond of the idea of packages that just collect unrelated functionality like crux or consult. Sending them my code directly works just well enough for 1:1 situations.

What I really want is some pastebin service dedicated to (Emacs-)Lisp code. Ideally content-addressable and distributed. Perhaps this could be based on Torrents, perhaps IPFS, or something else entirely. Maybe this could also use CRDT as a basis?

Perhaps an example can clarify my idea. Assuming some name, say dcap (distributed content-addressed pastebin), I could define a little function in my own configuration as follows:

(dcap-defun kill-region-or-word ()
  "Call `kill-region' if there is an active region.
Otherwise kill the last word, just like in Unix."
  (interactive)
  (call-interactively (if (region-active-p)
                          #'kill-region
                        #'backward-kill-word)))

Basically, the same as above, with the minor difference that I used a macro called dcap-defun instead of defun. This would define a function for my own use, and declare a public snippet. The snippet would then be addressed using a hashsum, say 6465c9e5c3426b66a9fa45663946884faebc80db3260c55192d1cd4322472450.

On the other end, someone might decide to use this command and include it on their end. They might write something like

(defalias 'kill-word-or-region
  (dcap-fetch-func "6465c9e5c3426b66a9fa45663946884faebc80db3260c55192d1cd4322472450"))

Note that the name used here is not the same as the one I used. What dcap-fetch-func does is retrieve a definition from the network (say using a more generic primitive like dcap-fetch) or use a cached copy, and ensure the return value is a function.
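The fetching side can be sketched as follows, under loud assumptions: a hash table stands in for the network, the address is the SHA-256 digest of the printed form (suggested only by the 64-digit example above), and every my-dcap- name is hypothetical.

```elisp
;; Stand-in for the network: maps hash strings to printed forms.
(defvar my-dcap-store (make-hash-table :test #'equal))

(defun my-dcap-publish (form)
  "Store FORM under the SHA-256 digest of its printed form; return the digest."
  (let* ((print-length nil)
         (print-level nil)
         (text (prin1-to-string form))
         (hash (secure-hash 'sha256 text)))
    (puthash hash text my-dcap-store)
    hash))

(defun my-dcap-fetch (hash)
  "Return the form stored under HASH, after verifying its checksum."
  (let ((text (gethash hash my-dcap-store)))
    (unless text
      (error "No snippet found for %s" hash))
    (unless (string= hash (secure-hash 'sha256 text))
      (error "Checksum mismatch for %s" hash))
    (car (read-from-string text))))

(defun my-dcap-fetch-func (hash)
  "Fetch the snippet addressed by HASH and ensure it evaluates to a function."
  (let ((value (eval (my-dcap-fetch hash) t)))
    (unless (functionp value)
      (error "Snippet %s does not evaluate to a function" hash))
    value))
```

The checksum only guarantees that the content is the one that was addressed. Whether that content should be evaluated at all remains a question of trust.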

As a hashsum like this can be inconvenient, having alias lists could be useful. Each such list could be designated by a URI that contains an association of human-readable names to hashsums.

$ curl https://some-website.net/path/to/my-alias.file
((kill-region-or-word "6465c9e5c3426b66a9fa45663946884faebc80db3260c55192d1cd4322472450")
 ;; ...
)

If configured, you could then do the same as above using a more sensible name (or a more convenient macro):

(defalias 'kill-word-or-region (dcap-fetch-func 'kill-region-or-word))
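Resolving such a name could be as simple as the following sketch. It assumes the alias list has already been downloaded and read into a variable. Both my-dcap-aliases and my-dcap-resolve are hypothetical names.

```elisp
;; Alias list in the (NAME HASH) format of the file shown above.
(defvar my-dcap-aliases
  '((kill-region-or-word
     "6465c9e5c3426b66a9fa45663946884faebc80db3260c55192d1cd4322472450")))

(defun my-dcap-resolve (name-or-hash)
  "Return the hash designated by NAME-OR-HASH.
Strings are taken to be hashes already; symbols are looked up in
`my-dcap-aliases'."
  (cond
   ((stringp name-or-hash) name-or-hash)
   ((symbolp name-or-hash)
    (or (cadr (assq name-or-hash my-dcap-aliases))
        (error "Unknown snippet alias: %S" name-or-hash)))
   (t (error "Expected a symbol or a string: %S" name-or-hash))))
```

A fetching function would call this first, so that both hashes and aliases are accepted in the same place.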

Implementing this is primarily a technical issue. Points that have to be considered are:

Might a distributed system introduce too much complexity? It is probable that the network couldn’t just rely on idle Emacs instances, and that a stand-alone node implementation would have to be written.

The important thing is: Reducing the overhead in sharing small improvements is one of the things I admired about the Emacs Commune. Packages usually imply there is a long-term project, that might grow over time. Copying code verbatim can be a nuisance and is not always reliable. I believe there is a niche between the two that can be satisfied.

Esoterical Text Manipulation Language

Another far-fetched idea is a little programming language for text manipulation. The twist is that I’d like to combine various features from different paradigms and programming languages.

My motivation stems from an appreciation of, but persistent scepticism regarding, modal editing. I do believe that an expressive language for manipulating text is of use, but I don’t think that vi’s approach – throw you into “normal mode” at first, and have “insert mode” be something you request – is ideal. I know that evil-mode, an emulator mode for people who have previously been using “Vim” (vi’s little brother), has the option of inverting this by default, but I remain unsatisfied. Other modal editing systems like Objed might be interesting if developed further, but I have been wondering whether an entirely different approach could be viable instead.

So how about this: A programming language that has buffer ranges as a primitive data type, and treats these as mutable strings. We borrow the stack paradigm from Forth, implicit mapping of functions over lists from APL, and intuitive regular expressions from AWK. This would allow us to express an intention like

Take all empty substrings at the beginning of each line and append a constant string.

There are many ways we could write this. Say we want to be verbose, and type out every intention word for word:

/^/ match-all "foo" append

we can imagine the stack being manipulated by each command as follows:

TOS                 ;; We start with an empty stack (TOS: Top of Stack)

TOS
/^/                 ;; A regular expression matching the beginning of a line

TOS
[(0;0) (41;41) ...] ;; A list of buffer intervals that match /^/

TOS
"foo"               ;; A constant string
[(0;0) (41;41) ...] 

TOS
[(0;3) (41;44) ...] ;; The intervals have been modified
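The stack walk above can be imitated by a tiny interpreter. The following is only a sketch under assumptions: programs are given as Lisp lists rather than parsed from text, a regular expression literal is written (re "^"), buffer intervals are pairs of markers, and all my-tml- names are hypothetical. Positions are Emacs buffer positions, which start at 1.

```elisp
(defun my-tml-match-all (regexp)
  "Return a list of (START . END) marker pairs for all matches of REGEXP."
  (let (ranges done)
    (save-excursion
      (goto-char (point-min))
      (while (and (not done) (re-search-forward regexp nil t))
        ;; END advances when text is inserted at it, START does not.
        (push (cons (copy-marker (match-beginning 0))
                    (copy-marker (match-end 0) t))
              ranges)
        ;; Step over empty matches to avoid looping forever.
        (when (= (match-beginning 0) (match-end 0))
          (if (eobp) (setq done t) (forward-char 1)))))
    (nreverse ranges)))

(defun my-tml-run (program)
  "Run PROGRAM, a list of tokens, on the current buffer; return the stack."
  (let (stack)
    (dolist (token program)
      (cond
       ;; Strings and regexp literals are pushed as they are.
       ((or (stringp token) (eq (car-safe token) 're))
        (push token stack))
       ((eq token 'match-all)
        (push (my-tml-match-all (cadr (pop stack))) stack))
       ((eq token 'append)
        (let ((string (pop stack))
              (ranges (pop stack)))
          (save-excursion
            (dolist (range ranges)
              (goto-char (cdr range))
              (insert string)))
          (push ranges stack)))
       (t (error "Unknown word: %S" token))))
    stack))
```

Evaluating (my-tml-run '((re "^") match-all "foo" append)) in a buffer would then prefix every line with "foo" and leave the grown intervals on the stack.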

This is fairly trivial, but how about an idea like

Match all lines that include one of “bar”, “baar”, “baaar”, … and reverse their order of occurrence.

This time let us assume a terse syntax,

/ba+r/ ml lr

Again, we begin with a regular expression, request all intervals of the lines that match it and then reverse the list – which has an effect on the buffer. We press enter and the program is executed. This could involve a special interpreter or it could be compiled into Emacs Lisp.
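If the program were compiled into Emacs Lisp, the result might behave roughly like the function below. This is a hand-written sketch with a hypothetical name, not the output of any existing compiler. It assumes that “reversing the list” means the first matching line swaps places with the last one, the second with the second to last, and so on, while non-matching lines stay where they are.

```elisp
(defun my-tml-reverse-matching-lines (regexp)
  "Reverse the order of the lines in the current buffer that match REGEXP.
Lines that do not match keep their position."
  (let (ranges texts)
    (save-excursion
      (goto-char (point-min))
      ;; Collect the interval and the text of every matching line.
      (while (and (not (eobp)) (re-search-forward regexp nil t))
        (push (cons (copy-marker (line-beginning-position))
                    (copy-marker (line-end-position) t))
              ranges)
        (push (buffer-substring (line-beginning-position)
                                (line-end-position))
              texts)
        (forward-line 1))
      ;; RANGES is put back into buffer order, TEXTS stays reversed, so
      ;; the first matching line receives the text of the last one.
      (setq ranges (nreverse ranges))
      (while ranges
        (let ((range (pop ranges))
              (text (pop texts)))
          (goto-char (car range))
          (delete-region (car range) (cdr range))
          (insert text))))))
```

The terse program above would then correspond to calling this function with the regular expression "ba+r".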

One more feature I would like to see is strong typing – specifically interactive and immediate strong typing. While we are at it why not throw dependent typing into the mix? Let us consider an example to illustrate my point. Imagine the following

Replace each instance of “foo” with a number in increasing order of their occurrence.

This time we use a single unicode character for each command and assume an appropriate input-method is provided.

foo×↑ρι%s→

This time regular expressions (foo) and strings (%s) aren’t quoted at all. They are distinguished by being regular ASCII characters, so adding quotation is optional. Next we…

Ideally this should fail and ding right after typing , because the values on the stack are a list of integers, and not strings, before anything is even executed. Replace the %s with a %d and the program type-checks. It can now be executed. While this is going on, and since the typing is interactive, the active buffer intervals and their replacements can be visualised on the active portion of the window.

(A different question is if you actually want this degree of strictness in a convenience language…)

The main issue here will be figuring out a good vocabulary (which will probably have to be user-extensible) and a flexible syntax to accommodate its needs.

A User Compat Library

Last year I shared the idea of working on a Forwards-Compatibility library for Emacs Lisp, and it has since not only been implemented but also published. It allows versions of Emacs going back to 24.3 (released 2013) to make use of a number of newer functions and macros. I am currently working on preparing support for Emacs 29.1, and hope to release it soon.

One restriction I drew when starting the project was that it won’t include any user-facing code. Any function that is also a command would only be usable as a function. The development branch for Emacs 29 intentionally leaves out the setopt macro. This is because Compat is a package that is rarely installed manually. Instead it is added as a dependency. And as dependencies are, they might appear or disappear, depending on what packages are installed and how clean you keep your package list.

(This argument is actually not that solid, because Emacs intentionally doesn’t draw a line between developers and users. If Compat is installed on Emacs 24.3, you could be using and-let* in your personal code in init.el and suddenly be confused if the dependency is removed.)

The idea here is simple: Provide a package with these missing definitions (commands and user-facing macros), that is supposed to be explicitly installed by the user.

There is not that much more to this idea, just a nice thing I think some people would appreciate. Being a package people would consciously install, it could risk being more invasive and opinionated, e.g. by (pseudo-)depending on other packages in ELPA such as project, xref, etc. to ensure the newest versions are installed.
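As an illustration of the kind of definition such a package would carry, here is a simplified sketch of a setopt-like macro. The name my-user-compat-setopt is hypothetical, and the sketch only covers the core behaviour: calling the option’s custom setter when there is one. It omits any checking of the value against the option’s declared type.

```elisp
(defmacro my-user-compat-setopt (&rest pairs)
  "Set each user option in PAIRS to its value, honouring custom setters.
PAIRS is a flat list of the form VARIABLE VALUE VARIABLE VALUE..."
  (unless (zerop (mod (length pairs) 2))
    (error "PAIRS must have an even number of variable/value members"))
  (let (forms)
    (while pairs
      (let ((variable (pop pairs))
            (value (pop pairs)))
        ;; Use the `:set' function recorded by `defcustom', if any.
        (push `(funcall (or (get ',variable 'custom-set) #'set-default)
                        ',variable ,value)
              forms)))
    (cons 'progn (nreverse forms))))

;; Only provide the public name where Emacs itself lacks it.
(unless (fboundp 'setopt)
  (defalias 'setopt 'my-user-compat-setopt))
```

Because the package would be installed deliberately, relying on the alias in init.el would not carry the risk described above for Compat.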


I am curious to hear if anyone thinks these ideas have any merit. It would be great if someone were interested in collaborating on developing or even implementing these projects. Right now, I am under the impression that I am reaching a limit as to how much time I want to invest into a hobby like Emacs development. I am a full-time student (and part-time TA) after all. This means I’ll be thinking twice before starting any new Elisp project, as I always have other ideas I would like to work on as well.