# thingy_grabber
A script for archiving Thingiverse things. Because this is a glorified web scraper, expect it to be fragile: changes to the site can break it.

## Usage:
````
usage: thingy_grabber.py [-h] [-l {debug,info,warning}] [-d DIRECTORY] [-f LOG_FILE] {collection,thing,user,batch,version} ...

positional arguments:
  {collection,thing,user,batch,version}
                        Type of thing to download
    collection          Download one or more entire collection(s)
    thing               Download a single thing.
    user                Download all things by one or more users
    batch               Perform multiple actions written in a text file
    version             Show the current version

optional arguments:
  -h, --help            show this help message and exit
  -l {debug,info,warning}, --log-level {debug,info,warning}
                        level of logging desired
  -d DIRECTORY, --directory DIRECTORY
                        Target directory to download into
  -f LOG_FILE, --log-file LOG_FILE
                        Place to log debug information to
````

### Things
`thingy_grabber.py thing thingid1 thingid2 ...`

For each given ID, this will create a directory named after the title of the thing and download its files into it.

### Collections
`thingy_grabber.py collection user_name collection_name1 collection_name2`

Where `user_name` is the name of the creator of the collection (not necessarily your own name!) and `collection_name1`, `collection_name2`, etc. are the name(s) of the collection(s) you want.

This will create a series of directories `user-collection/thing-name`, one for each thing in the collection.

If a download fails for some reason, its directory is renamed to `thing-name-failed`. This way, if you rerun the script, it will only reattempt the failed things.

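The rename-on-failure behaviour can be pictured with the sketch below. It is an illustration under assumptions, not the script's actual code: `mark_failed` is a hypothetical helper, and discarding an already existing `-failed` directory is a guess at what a rerun might do.

```python
import shutil
import tempfile
from pathlib import Path


def mark_failed(thing_dir: Path) -> Path:
    """Rename `thing-name` to `thing-name-failed` and return the new path."""
    failed_dir = thing_dir.with_name(thing_dir.name + "-failed")
    if failed_dir.exists():
        # Assumption: leftovers from an earlier failed attempt are discarded.
        shutil.rmtree(failed_dir)
    thing_dir.rename(failed_dir)
    return failed_dir


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    thing_dir = Path(tmp) / "bike-mount"
    thing_dir.mkdir()
    print(mark_failed(thing_dir).name)  # bike-mount-failed
```

Because the original directory name no longer exists after the rename, a later run can treat that thing as not yet downloaded.
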
### User designs
`thingy_grabber.py user user_name1 user_name2 ...`

Where `user_name1`, `user_name2`, etc. are the names of the creators.

This will create a series of directories `user designs/thing-name`, one for each thing that user has designed.

If a download fails for some reason, its directory is renamed to `thing-name-failed`. This way, if you rerun the script, it will only reattempt the failed things.

### Batch mode
`thingy_grabber.py batch batch_file`

This will load the given text file and parse it as a series of calls to this script. Each line of the file should be of the form `command arg1 ...`.
Be warned that there is currently NO validation that you have given a correct set of commands!

An example:
````
thing 3670144
collection cwoac bike
user cwoac
````

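Since batch files are not validated, it can help to check one yourself before scheduling it. The following is a minimal sketch of such a check, not the script's own parser. `parse_batch_lines` is a hypothetical helper, and two things are assumptions based only on the example above: that arguments are separated by whitespace, and that `thing`, `collection` and `user` are the only commands a batch file should contain.

```python
import shlex

# Assumption: only these subcommands make sense inside a batch file.
BATCH_COMMANDS = {"thing", "collection", "user"}


def parse_batch_lines(lines):
    """Split batch-file lines into (command, args) pairs.

    Blank lines are skipped. Raises ValueError on an unknown command.
    """
    parsed = []
    for number, raw in enumerate(lines, start=1):
        tokens = shlex.split(raw)
        if not tokens:
            continue
        command, args = tokens[0], tokens[1:]
        if command not in BATCH_COMMANDS:
            raise ValueError(f"line {number}: unknown command {command!r}")
        parsed.append((command, args))
    return parsed


example = ["thing 3670144", "collection cwoac bike", "user cwoac"]
for command, args in parse_batch_lines(example):
    print(command, args)
```

To check a real file, pass the open file object instead of a list, for example `parse_batch_lines(open("batchfile.txt"))`.
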
If you are using Linux, you can just add an appropriate call to the crontab. If you are using Windows, it's a bit more of a faff, but at least according to [this link](https://www.technipages.com/scheduled-task-windows), you should be able to do it with a command something like this (this is not tested!): `schtasks /create /tn thingy_grabber /tr "c:\path\to\thingy_grabber.py -d c:\path\to\output\directory batch c:\path\to\batchfile.txt" /sc weekly /d wed /st 13:00:00`
You may have to play with the quotation marks to make that work though.

## Examples
`thingy_grabber.py collection cwoac bike`

Download the collection 'bike' by the user 'cwoac'.

`thingy_grabber.py -d downloads -l warning thing 1234 4321 1232`

Download the three things 1234, 4321 and 1232 into the directory `downloads`. Only give warnings.

`thingy_grabber.py -d c:\downloads -l debug user jim bob`

Download all designs by jim and bob into directories under `c:\downloads`, and give lots of debug messages.

## Requirements
python3, beautifulsoup4, requests, lxml

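A quick way to confirm that the Python packages are installed is sketched below. Note that `beautifulsoup4` is imported as `bs4`. The helper name `missing_modules` is made up for this example and is not part of the script.

```python
import importlib.util

# Import names for the packages listed above (beautifulsoup4 is imported as bs4).
REQUIRED_MODULES = ["bs4", "requests", "lxml"]


def missing_modules(names):
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```
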
## Current features:
- Can download an entire collection, creating separate subdirectories for each thing in the collection.
- If you run it again with the same settings, it will check for updated files and only update what has changed. This should make it suitable for syncing a collection from a cron job.
- If there is an updated file, the old directory will be moved to `name_timestamp`, where `timestamp` is the last upload time of the old files. The code will then copy unchanged files across and download any new ones.

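The update behaviour in the last two points can be sketched roughly as follows. This is an illustration under assumptions rather than the script's implementation: `rotate_and_reuse` is a hypothetical helper, the timestamp format is invented, and the caller decides which files count as unchanged by passing in their names.

```python
import shutil
import tempfile
from pathlib import Path


def rotate_and_reuse(thing_dir: Path, timestamp: str, unchanged: set) -> list:
    """Move `thing_dir` to `name_timestamp`, recreate it, and copy unchanged files back.

    Returns the names of the files that were copied across; anything else
    would still need to be downloaded.
    """
    old_dir = thing_dir.with_name(f"{thing_dir.name}_{timestamp}")
    thing_dir.rename(old_dir)
    thing_dir.mkdir()
    copied = []
    for path in sorted(old_dir.iterdir()):
        if path.is_file() and path.name in unchanged:
            shutil.copy2(path, thing_dir / path.name)
            copied.append(path.name)
    return copied


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    thing_dir = Path(tmp) / "bike-mount"
    thing_dir.mkdir()
    (thing_dir / "base.stl").write_text("old base")
    (thing_dir / "clip.stl").write_text("old clip")
    # Pretend that only clip.stl changed upstream.
    print(rotate_and_reuse(thing_dir, "2020-01-01", {"base.stl"}))  # ['base.stl']
```

The old directory is kept intact, so the previous version of every file remains available under `name_timestamp`.
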
## Changelog
* v0.6.3
  - Caught edge case involving old dir clashes
  - Added support for a separate log file
* v0.6.2
  - Added catches for 404s, 504s and malformed pages
* v0.6.1
  - Now downloads readme.txt and licence details
* v0.6.0
  - Added support for downloading multiple things/design sets/collections from the command line
* v0.5.0
  - Better logging options
  - Batch mode
* v0.4.0
  - Added a changelog
  - Now downloads associated images
  - Support `-d` to specify base download directory

## Todo features (maybe):
- attempt to use `-failed` dirs for resuming
- GUI?
