I have downloaded a 2016 ‘dump’ of the marxists.org website. I was wondering if anyone wants to help me download all the PDFs/EPUBs that are missing, get all the PDFs/EPUBs out of the dump (as it is not well organized), put it all in neatly organized folders, and package it as a .torrent for anyone to download. If anyone wants to help me with this endeavour or has any tips on ways to do it, I will greatly appreciate it. I am broke asf, I can’t pay for wage slaves, sorry.
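
For the step of pulling the PDFs/EPUBs out of the dump and into tidy folders, a minimal Python sketch might look like the one below. It assumes the dump is an ordinary folder tree on disk; the function name `collect_ebooks`, the `depth` parameter, and the idea of grouping by the first two folder levels are illustrative choices, not something taken from the dump itself.

```python
import shutil
from pathlib import Path

# File types to pull out of the dump (compared case-insensitively).
EXTENSIONS = {".pdf", ".epub"}


def collect_ebooks(dump_root, out_root, depth=2):
    """Copy every PDF/EPUB under dump_root into out_root.

    Files are grouped by the first `depth` folder levels of their path
    relative to dump_root (an assumed grouping, adjust to taste).
    Returns the list of destination paths that were written.
    """
    dump_root, out_root = Path(dump_root), Path(out_root)
    copied = []
    # sorted() materialises the listing first, so files copied during
    # the loop are never picked up again in the same run.
    for src in sorted(dump_root.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in EXTENSIONS:
            continue
        rel = src.relative_to(dump_root)
        group = Path(*rel.parts[:-1][:depth])  # parent folders, truncated
        dest = out_root / group / src.name
        # On a name clash, add a numeric suffix instead of overwriting.
        n = 1
        while dest.exists():
            dest = out_root / group / f"{src.stem}_{n}{src.suffix}"
            n += 1
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied.append(dest)
    return copied
```

Calling something like `collect_ebooks("dump", "library")` (both folder names are placeholders) would leave the dump untouched and build the organized copy next to it. Running it twice into the same output folder would create `_1` duplicates, so it is best pointed at an empty folder.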

The reason I want to do this is to make it easy and accessible for anyone to download any Marxist works. Even though I do like the website, I think there should be a way for anyone to just download everything.

Please don’t take this too harshly if I missed something or said something wrong; it is my first post.

  • knfrmity@lemmygrad.ml
    1 month ago

    I tried something like this before. I started with a curl command or script that would follow every internal link in a page to recursively download the website. That was annoying because there can be a lot of extra elements you don’t really need. Then I tried ArchiveBox, which was a bit OP for my purposes but may work well here.

    • klepti@lemmygrad.ml (OP)
      1 month ago

      I’m currently trying out wget. It’s officially supported, so that’s good, but I’m struggling a bit with space and stuff, so I’ll probably end up just doing a selective download.

      • haui@lemmygrad.ml
        1 month ago

        What’s the problem with space? Do you need patterns for the addresses? I’m possibly in a position to help. Let me know what you have tried so far. I am also on the Matrix server if you want to DM me.
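
The recursive crawl described in the first reply, narrowed to PDF/EPUB links so the download stays selective and light on disk space, could be sketched roughly as below. This is only an illustration using the Python standard library, not what anyone in the thread actually ran; `find_ebooks`, `fetch_page`, `max_pages`, and `delay` are made-up names, and the list of extensions treated as pages is an assumption about how the site is laid out.

```python
import time
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen

EBOOK_EXTS = {".pdf", ".epub"}      # files we want to list
PAGE_EXTS = {"", ".htm", ".html"}   # assumed: pages worth crawling further


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def fetch_page(url):
    """Download one page and return its text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def find_ebooks(start_url, fetch=fetch_page, max_pages=50, delay=1.0):
    """Breadth-first crawl of internal links; return PDF/EPUB URLs found."""
    host = urlparse(start_url).netloc
    queue, seen, ebooks = [start_url], {start_url}, []
    visited = 0
    while queue and visited < max_pages:
        page = queue.pop(0)
        visited += 1
        parser = LinkParser()
        parser.feed(fetch(page))
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" parts.
            url = urldefrag(urljoin(page, href)).url
            if url in seen or urlparse(url).netloc != host:
                continue  # already handled, or an external link
            seen.add(url)
            suffix = PurePosixPath(urlparse(url).path).suffix.lower()
            if suffix in EBOOK_EXTS:
                ebooks.append(url)
            elif suffix in PAGE_EXTS:
                queue.append(url)
            # anything else (images, archives, ...) is ignored
        time.sleep(delay)  # pause between requests to go easy on the server
    return ebooks
```

The result is just a list of URLs, so it can be compared against what is already in the dump before anything is downloaded. Only the missing files then need to be fetched.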