Some code to rescue your Diigo bookmarks

Sometimes you notice you have some digital housekeeping to do, so you think: easy, I’ll just write a couple of lines of code to do the job. Bad idea – this will give you another project to abandon in no time, and eat up lots of time into the bargain. But it may have been worth it – here are some Python routines to manage your bookmarks on diigo.com and in your Nextcloud.

Diigo was unique when I first started using it around 2010, after having to leave Delicious. It let you browse the web and keep what you saw, and it made you tag and describe what you found and why you thought it was worth keeping. It became an essential tool for managing my knowledge, and so I have been a paying customer for years.

But the service is a dead end, and one that is becoming more and more uncomfortable for a couple of reasons. First, there are those things that seemed like a good idea back then – putting most of what you read, and what you think of it, in a public directory – but that was when social media seemed much more of a promise than a nuisance. I no longer think that the value this might have for a few hypothetical followers offsets the privacy risks, and I don’t want my bookmarks to become training data for AIs.

The second, and more important, reason for leaving Diigo is that it has started acting up: bookmarking is out of sync, searches do not turn up what I know I have saved, and the service has been unreliable in general. So I guess that Diigo is no longer properly maintained, and it has seen no further development for years. As of 2024, it is still running, but I guess they might take it down at any time.

Doing it by the book: What the official Diigo API can do for you

At least there is hope. Diigo offers tools for exporting to an HTML file or a CSV table, and there is an API. The CSV is usable, so the first thing I wrote was code that took that CSV and uploaded it to my Nextcloud. (That is not perfect, but it works, although you lose the original creation date of the bookmarks.) But you would still have to delete all your bookmarks from Diigo by hand – or, at least, set them to private. The Diigo site gives you options to edit in bulk, but that would still be a piece of work – I gathered a couple of thousand bookmarks over the years. So let’s try the API to automate that.
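As a rough sketch of the CSV step described above: the snippet below turns the rows of a Diigo CSV export into bookmark dictionaries that could then be sent to the Nextcloud Bookmarks app. The column names (title, url, tags, description), the comma-separated tag format, and the function name are assumptions for illustration; check them against your own export. The actual upload call is left out.

```python
import csv
import io


def diigo_csv_to_payloads(csv_text):
    """Turn the text of a Diigo CSV export into a list of bookmark dicts.

    Assumed columns: title, url, tags, description (verify with your export).
    """
    payloads = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = (row.get("url") or "").strip()
        if not url:
            continue  # a bookmark without a URL cannot be imported
        raw_tags = row.get("tags") or ""
        # Assumption: tags are separated by commas inside one CSV field.
        tags = [tag.strip() for tag in raw_tags.split(",") if tag.strip()]
        payloads.append({
            "url": url,
            "title": (row.get("title") or url).strip(),
            "description": (row.get("description") or "").strip(),
            "tags": tags,
        })
    return payloads
```

Keeping the conversion separate from the upload makes it easy to inspect the result before anything is written to Nextcloud.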

The Diigo API gives a strong vibe of “I’ll complete that when I get round to it”. Documentation is sketchy, you have to generate an API token to use along with user and password, and it basically offers you only two functions: Get a bookmark, write or overwrite a bookmark.
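To make the "token plus user and password" combination concrete, here is a minimal sketch of how a "get bookmarks" request could be assembled with the standard library. The endpoint URL and the parameter names (key, user, start, count) are what I recall from the public API documentation, so treat them as assumptions; the function name is made up for this example, and nothing is actually sent.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: endpoint as given in the public Diigo API docs.
API_URL = "https://secure.diigo.com/api/v2/bookmarks"


def build_get_bookmarks_request(user, password, api_key, start=0, count=10):
    """Build (but do not send) a GET request for a page of bookmarks."""
    # The API key travels as a query parameter ...
    query = urlencode({"key": api_key, "user": user, "start": start, "count": count})
    # ... while user and password go into an HTTP Basic auth header.
    credentials = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(f"{API_URL}?{query}", headers={"Authorization": f"Basic {credentials}"})
```

Passing the returned object to urllib.request.urlopen would perform the call.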

Failed to delete bookmark 'Nowcasting fatal COVID‐19 infections on a regional level in Germany - Schneble - - Biometrical Journal - Wiley Online Library' (https://onlinelibrary.wiley.com/doi/10.1002/bimj.202000143). Status code: 400 API Message: {

After some experimentation, I found a third, undocumented function to delete bookmarks with a DELETE call, which was the whole point of using the API. This sort of works, but these API calls are heavily rate-limited. After a dozen deletes, you are forced to wait some minutes, then you may try to delete another 10 bookmarks until the API stops you once again. And even with precautions for that, my first scripts kept on crashing.
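One way to live with this kind of rate limiting is to wait and retry rather than crash. The following is only a sketch under assumptions: delete_one stands in for whatever function sends the DELETE call and returns the HTTP status code, any status other than 200 is treated as "try again later", and the five-minute default wait is a guess, not a documented limit.

```python
import time


def delete_with_backoff(delete_one, urls, wait_seconds=300, max_retries=5, sleep=time.sleep):
    """Call delete_one(url) for every URL, pausing whenever a call is refused.

    delete_one is a hypothetical callable returning an HTTP status code.
    Returns the list of URLs that could not be deleted.
    """
    failed = []
    for url in urls:
        for attempt in range(max_retries):
            try:
                status = delete_one(url)
            except Exception:
                status = None  # treat network errors like a refusal
            if status == 200:
                break
            if attempt < max_retries - 1:
                sleep(wait_seconds)  # give the rate limiter time to reset
        else:
            # All attempts were refused: remember the URL instead of crashing.
            failed.append(url)
    return failed
```

Because failures are collected rather than raised, a long run can finish and the leftovers can be retried later.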

Surely, there’s a better way?

Well, there is a better way, at least for some of the things you may need to do. The Diigo web page isn’t limited to those two-and-a-half methods of the official API. It uses additional endpoints for getting the actual work done, and an external script can use them, too.

(Screenshot: browser developer tools, network tab, showing the diigo.com page using a /interact_api/load_user_items endpoint to get bookmarks.)

This makes getting things to work a bit easier and a bit more difficult at the same time. Rather than using an API token to authenticate external calls, the site uses session cookies. So you will have to authenticate your account in a browser window first, solving one of those annoying Click-all-traffic-lights-while-we-move-them-round CAPTCHAs.

But after that, you’re good – the script saves the cookies from that session and reuses them to authenticate calls to find, list, filter, and modify bookmarks. This method – going through what I call the Interaction API – is much faster, and there is no rate limiting either.
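The save-and-reuse idea can be sketched in a few lines. This is not necessarily how the script does it: the JSON file format and both function names are assumptions for illustration, and the cookie names depend on what the Diigo session actually sets.

```python
import json
from pathlib import Path


def save_cookies(cookies, path):
    """Store a dict of cookie names and values from a logged-in session."""
    Path(path).write_text(json.dumps(cookies), encoding="utf-8")


def load_cookie_header(path):
    """Read the stored cookies and format them as a Cookie header value."""
    cookies = json.loads(Path(path).read_text(encoding="utf-8"))
    return "; ".join(f"{name}={value}" for name, value in cookies.items())
```

The returned string can be sent as the Cookie header of later calls until the session expires, at which point you log in through the browser again.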

You’re not allowed to do things fast, though

The “Write Bookmark” method requires the code to spoof a browser, otherwise it doesn’t work. That can easily be done in the header of the API calls, but it is not enough for three more API calls that would speed things up enormously: delete_b, mark_readed, and convert_mode.
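Spoofing a browser usually comes down to sending browser-like request headers. A minimal sketch, assuming that a User-Agent header is what the server checks (the exact header values below are placeholders, and the function name is invented for this example):

```python
from urllib.request import Request

# Assumption: any current browser User-Agent string will do.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0",
    "Accept": "application/json, text/plain, */*",
}


def browser_like_request(url, cookie_header, data=None):
    """Build (but do not send) a request that looks like it came from a browser."""
    headers = dict(BROWSER_HEADERS)
    headers["Cookie"] = cookie_header  # session cookies from the browser login
    # With data=None this is a GET; passing bytes turns it into a POST.
    return Request(url, data=data, headers=headers)
```

As noted above, headers like these are not sufficient for the bulk calls, which still answer with 403.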

These calls return a “403 Forbidden” message, and I cannot get them to work. Which is a shame: The website uses them to bulk-edit and bulk-delete, but as long as I cannot use them, I will have to send 100 single API calls instead of one call with a list of 100 bookmarks.

My guess is that these methods only work on the Diigo server itself. I’ve asked Diigo; if I get a reply, I will try to update the code.

What the code does…

The main.py script features a simple command-line menu system to allow you to do some work:

  • Set all Diigo bookmarks to private
  • Export, delete, and re-import to Nextcloud
  • Export and import Nextcloud bookmarks in the same CSV format that Diigo uses
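A menu like this can be little more than a list of labels and functions. The sketch below shows one possible shape; the real main.py may look quite different, and the action functions here are placeholders that only return a marker string.

```python
def set_all_private():
    return "set-private"  # placeholder for the real work


def export_delete_import():
    return "export-delete-import"  # placeholder


def nextcloud_csv_roundtrip():
    return "nextcloud-csv"  # placeholder


MENU = [
    ("Set all Diigo bookmarks to private", set_all_private),
    ("Export, delete, and re-import to Nextcloud", export_delete_import),
    ("Export and import Nextcloud bookmarks as Diigo-style CSV", nextcloud_csv_roundtrip),
]


def render_menu(menu):
    """Return the numbered menu as text, one entry per line."""
    return "\n".join(f"{number}. {label}" for number, (label, _) in enumerate(menu, start=1))


def dispatch(choice, menu):
    """Run the action for a typed choice; return None for invalid input."""
    try:
        index = int(choice) - 1
    except ValueError:
        return None
    if 0 <= index < len(menu):
        return menu[index][1]()
    return None
```

A loop around print(render_menu(MENU)) and dispatch(input("> "), MENU) would turn this into an interactive menu.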

…and I’ll still have to do the impressive stuff

We live in the time of AI, and of course, I intended to have an AI language model do some maintenance work for me: Check the bookmarks, augment the description with an AI-generated summary, suggest tags. (Tags! Don’t get me started on tags! Let’s just note that humans are not very consistent in tagging – a sizeable portion of the tags in my Diigo file has been used only once.) Maybe, some day, write an embedding for that summary into a vector database to allow you to query your bookmarks with a chatbot.

But this will have to wait.

2 thoughts on “Some code to rescue your Diigo bookmarks”

  1. Oh yes. I feel that Diigo has declined since they dropped Lists and introduced Outliners. Since then, there has been no sign of life from the operators either. Right now would be their chance to freshen up the platform and implement AI-generated page summaries, AI-generated tags, etc., to bring it up to a more modern standard. Still, I think it remains the best way to manage your bookmarks since Delicious sold itself to Yahoo. I really like using the highlighter. I haven’t noticed any sync problems so far, but sometimes you have to open your library in the browser so that the browser add-on reconnects to the main site and saves the data properly. The “social” features have never interested me; I prefer to choose whom I tell that I exist at all, or what I am working on. That is why everything has been private by default for me from the start.

    I’m currently looking at the umpteenth solution that might replace Diigo – in my personal assessment, basically all of them have scored worse than Diigo so far. The thing is called iki.ai (currently on offer on Appsumo). In addition to bookmarks, you can also save YT videos and have them transcribed, upload documents and chat with your own library, and of course research the web from within the platform and add what you find to your own library. On a first test it seems quite cool. But on second glance, I would be careful for now about uploading my documents without reservation. For bookmarking articles and videos it is probably OK for the time being. For personal documents, rather not: the PDFs apparently sit unencrypted in some cloud. I’m also still missing an export function. But maybe something will happen there in the course of further development.

    • In fact, after a few weeks of Nextcloud Bookmarks, I’m only so-so satisfied, but it works. My hope is that AI features like transcription and automatic tag generation will appear bit by bit, and that I won’t have to write something myself. :(
