Thursday, June 17, 2010

Deleting duplicate BibTeX entries from Mendeley

I've recently moved to Mendeley as my main reference management tool. It has some pretty neat features, but a few things keep bugging me. One is that Mendeley seems to keep a record of deleted entries in its database, and those end up in its BibTeX export. Worst of all, I always get duplicate entries in the BibTeX file.

Today I found a neat solution based on the post by Simon Greenhill. Here's how it works for the citation key:
  1. Locate your Mendeley SQLite Library (~/Library/Application Support/Mendeley Desktop/youremail@www.mendeley.com.sqlite)
  2. From your terminal, type:
    sqlite3 your-email-address\@www.mendeley.com.sqlite
  3. Type:
    SELECT COUNT(*) as entries, citationkey FROM Documents GROUP BY citationkey HAVING entries > 1;
    This lists every citation key in the "Documents" table that appears more than once, together with the number of entries that share it.
  4. Then, type:
    DELETE FROM Documents WHERE id NOT IN (SELECT MAX(id) FROM Documents GROUP BY citationkey);
    This keeps only the entry with the highest id for each citation key and deletes the rest.
Voila!
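
For anyone who would rather try the two queries before touching a real library, here is a minimal sketch in Python using the standard sqlite3 module. It runs against a throwaway in-memory table, not the actual Mendeley file. The function names find_duplicate_keys and delete_duplicates are made up for illustration, and the only columns assumed are id and citationkey (the title column exists just to make the demo readable). One deliberate difference from step 4: SQLite groups all rows with an empty (NULL) citation key together, so the sketch adds a "citationkey IS NOT NULL" filter to leave those rows alone.

```python
import sqlite3


def find_duplicate_keys(conn):
    """Return (count, citationkey) for every key used by more than one row."""
    return conn.execute(
        "SELECT COUNT(*) AS entries, citationkey FROM Documents "
        "WHERE citationkey IS NOT NULL "
        "GROUP BY citationkey HAVING entries > 1"
    ).fetchall()


def delete_duplicates(conn):
    """Keep the highest id per citation key; rows without a key are untouched."""
    cur = conn.execute(
        "DELETE FROM Documents "
        "WHERE citationkey IS NOT NULL "  # assumption: added filter, not in step 4
        "AND id NOT IN (SELECT MAX(id) FROM Documents GROUP BY citationkey)"
    )
    conn.commit()
    return cur.rowcount  # number of rows deleted


# Demo on an in-memory stand-in for the Documents table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Documents (id INTEGER PRIMARY KEY, citationkey TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO Documents (citationkey, title) VALUES (?, ?)",
    [
        ("Saad2010", "first copy"),
        ("Saad2010", "second copy"),
        ("Knuth1984", "unique entry"),
        (None, "no key A"),
        (None, "no key B"),
    ],
)

duplicates = find_duplicate_keys(conn)
removed = delete_duplicates(conn)
remaining = conn.execute(
    "SELECT id, citationkey FROM Documents ORDER BY id"
).fetchall()

print("duplicate keys:", duplicates)
print("rows removed:", removed)
print("remaining rows:", remaining)
```

To point this at a real library, one would replace ":memory:" with the path from step 1 and skip the CREATE and INSERT statements, ideally working on a copy of the .sqlite file with Mendeley closed. Like the steps above, the sketch only touches the Documents table.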

Cite as:
Saad, T. "Deleting duplicate BibTeX entries from Mendeley". Weblog entry from Please Make A Note. http://pleasemakeanote.blogspot.com/2010/06/deleting-duplicate-bibtex-entries-from.html

13 comments:

  1. Nifty trick! For the record, I'm one of the members of the Mendeley QA department, and this post caught my eye.

    The fact that we're generating duplicate content for a single citation key in bibtex files sounds like a bug on our part, and it's something I'm investigating so I can pass it on to our developers. Hopefully, there will be a proper fix within Mendeley Desktop as part of a future release.

  2. Thanks for taking care of that.

    In fact, Mendeley seems to keep bibtex entries even after the documents themselves have been deleted. A fix for that would be most valuable!

    Best,
    Tony

  3. Yes, a fix for this problem is most urgent. The Mendeley-generated bibtex file is useless if it contains duplicates.

    The funny thing about the interim fix proposed above: after I apply it, Mendeley just recovers the database and nothing changes in the bibtex file (although it is rewritten).

    I have Mendeley (latest stable version) on Ubuntu 9.10.

    When I run these sqlite instructions, they find several duplicates and even a few triplicates. None of these duplicates/triplicates appear as such in Mendeley itself, but they still show up multiple times in the auto-generated bibtex file.

    Anyway, if you do a "Select All" and export a bibtex file manually, you get a bibtex file without any duplicates.

    All in all, it seems like the Mendeley team have got some work to do on this issue. I for one do not understand what is going on.

  4. Any news on this, Mr. and Mrs. Mendeley? This "bug" is still prevalent in the current version (0.9.8.1).

    G.

  5. A partly related issue that keeps bugging me is how to insert latex math equations in the title of a reference, for example to cite the following correctly. The $ is translated to \$, which makes it impossible for latex to understand.

    Candès, E.J. & Plan, Y., 2009. Near-ideal model selection by $\ell_{1}$ minimization. The Annals of Statistics, 37(5A), pp.2145-2177. Available at: http://projecteuclid.org/euclid.aos/1247663751.

  6. It's alive!!! It seems to work for me. Just what I was looking for.

  7. Vote for fixing this bug at the Mendeley feedback forums:

    http://feedback.mendeley.com/forums/4941-mendeley-feedback/suggestions/1589255-bug-removing-entries-from-bibtex

  8. This is really making it difficult for me to completely adopt Mendeley. If I'm going to have something manage all of my research, I really need it to produce a decent bibtex file. Any news from the Mendeley end?

  9. Hi,
    It seems I found a solution: delete the documents in the trash! They are also used for the bibtex file.

  10. Yes, don't forget to delete documents in your trash! It also solved all duplication problems for me.

    Not a bug, maybe just an undesired feature, or a sign that users need to be made more aware of it.

  11. There is also the problem that bibtex entries are doubled for documents that are both in your library and in one (or more) of your groups.

    The integration/synchronisation of collections and groups leaves a lot to be desired...

  12. Thanks, that trick saved me a lot of time!
