# thingy_grabber
Script for archiving Thingiverse things. Because this is a glorified web scraper, it is going to be very fragile.

## Usage:
````
usage: thingy_grabber.py [-h] [-l {debug,info,warning}] [-d DIRECTORY] {collection,thing,user,batch,version} ...

positional arguments:
  {collection,thing,user,batch,version}
                        Type of thing to download
    collection          Download one or more entire collection(s)
    thing               Download a single thing.
    user                Download all things by one or more users
    batch               Perform multiple actions written in a text file
    version             Show the current version

optional arguments:
  -h, --help            show this help message and exit
  -l {debug,info,warning}, --log-level {debug,info,warning}
                        level of logging desired
  -d DIRECTORY, --directory DIRECTORY
                        Target directory to download into
````

### Things
`thingy_grabber.py thing thingid1 thingid2 ...`
This will create a directory named after the title of the thing(s) with the given ID(s) and download the files into it.

### Collections
`thingy_grabber.py collection user_name collection_name1 collection_name2`
Where `user_name` is the name of the creator of the collection (not necessarily your name!) and `collection_name1`, `collection_name2`, etc. are the name(s) of the collection(s) you want.

This will create a series of directories `user-collection/thing-name`, one for each thing in the collection.

If for some reason a download fails, it will get moved sideways to `thing-name-failed` - this way if you rerun it, it will only reattempt any failed things.

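The "moved sideways" behaviour can be pictured with a minimal sketch. This is not the script's actual code: `mark_failed` is a hypothetical helper, and the only detail taken from the text above is the `-failed` suffix. Discarding an older `-failed` directory is an assumption made to keep the example short.

```python
import os
import shutil
import tempfile


def mark_failed(thing_dir):
    """Rename thing_dir to '<thing_dir>-failed' and return the new path.

    Hypothetical helper, not taken from thingy_grabber itself.
    """
    failed_dir = thing_dir.rstrip("/\\") + "-failed"
    if os.path.exists(failed_dir):
        # Assumption: a leftover from an earlier failed attempt can be discarded.
        shutil.rmtree(failed_dir)
    os.rename(thing_dir, failed_dir)
    return failed_dir


# Demo in a throwaway directory.
base = tempfile.mkdtemp()
thing_dir = os.path.join(base, "thing-name")
os.makedirs(thing_dir)
failed = mark_failed(thing_dir)
```

Because the original `thing-name` directory no longer exists after the rename, a later run would see the thing as missing and download it again.
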
### User designs
`thingy_grabber.py user user_name1 user_name2 ...`
Where `user_name1 ...` are the names of the creators.

This will create a series of directories `user designs/thing-name`, one for each thing that user has designed.

If for some reason a download fails, it will get moved sideways to `thing-name-failed` - this way if you rerun it, it will only reattempt any failed things.

### Batch mode
`thingy_grabber.py batch batch_file`
This will load the given text file and parse it as a series of calls to this script. Each line of the file should be of the form `command arg1 ...`.
Be warned that there is currently NO validation that you have given a correct set of commands!

An example:
````
thing 3670144
collection cwoac bike
user cwoac
````

If you are using Linux, you can just add an appropriate call to the crontab. If you are using Windows, it's a bit more of a faff, but at least according to [this link](https://www.technipages.com/scheduled-task-windows), you should be able to do it with a command something like this (this is not tested!): `schtasks /create /tn thingy_grabber /tr "c:\path\to\thingy_grabber.py -d c:\path\to\output\directory batch c:\path\to\batchfile.txt" /sc weekly /d wed /st 13:00:00`
You may have to play with the quotation marks to make that work though.

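Since batch files are not validated, it can help to check one before scheduling it. The following is a rough sketch of how such lines might be split and checked; `parse_batch_lines` and `KNOWN_COMMANDS` are hypothetical names, and the accepted command set is an assumption based on the example file above, not the script's real parser.

```python
import shlex

# Assumption: the commands that make sense inside a batch file.
KNOWN_COMMANDS = {"thing", "collection", "user"}


def parse_batch_lines(lines):
    """Turn batch-file lines into (command, args) pairs, rejecting unknown commands."""
    actions = []
    for number, raw in enumerate(lines, start=1):
        parts = shlex.split(raw)
        if not parts:
            continue  # skip blank lines
        command, args = parts[0], parts[1:]
        if command not in KNOWN_COMMANDS:
            raise ValueError(f"line {number}: unknown command {command!r}")
        if not args:
            raise ValueError(f"line {number}: {command!r} needs at least one argument")
        actions.append((command, args))
    return actions


example = ["thing 3670144", "collection cwoac bike", "user cwoac", ""]
actions = parse_batch_lines(example)
```

Each resulting pair holds the command name and its list of arguments, so a typo such as `colection cwoac bike` would raise an error up front instead of failing partway through a scheduled run.
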
## Examples
`thingy_grabber.py collection cwoac bike`
Download the collection 'bike' by the user 'cwoac'.

`thingy_grabber.py -d downloads -l warning thing 1234 4321 1232`
Download the three things 1234, 4321 and 1232 into the directory `downloads`. Only give warnings.

`thingy_grabber.py -d c:\downloads -l debug user jim bob`
Download all designs by jim and bob into directories under `c:\downloads`, and give lots of debug messages.

## Requirements
python3, beautifulsoup4, requests, lxml

## Current features:
- Can download an entire collection, creating separate subdirs for each thing in the collection
- If you run it again with the same settings, it will check for updated files and only update what has changed. This should make it suitable for syncing a collection on a cronjob
- If there is an updated file, the old directory will be moved to `name_timestamp` where `timestamp` is the last upload time of the old files. The code will then copy unchanged files across and download any new ones.

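The update behaviour in the last bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's own code: `archive_and_refresh` is a hypothetical helper, the timestamp format is assumed, and the download of new or changed files is left out.

```python
import os
import shutil
import tempfile
from datetime import datetime


def archive_and_refresh(thing_dir, last_upload, unchanged):
    """Move thing_dir to '<thing_dir>_<timestamp>', then recreate it with the unchanged files.

    last_upload: datetime of the old files' last upload.
    unchanged: names of files that are identical in the new version.
    Returns the path of the archived directory.
    """
    stamp = last_upload.strftime("%Y-%m-%d_%H-%M-%S")  # assumed timestamp format
    archive_dir = f"{thing_dir}_{stamp}"
    os.rename(thing_dir, archive_dir)
    os.makedirs(thing_dir)
    for name in unchanged:
        # Copy instead of downloading again; changed files would be fetched afterwards.
        shutil.copy2(os.path.join(archive_dir, name), os.path.join(thing_dir, name))
    return archive_dir


# Demo: a thing with two files, of which only a.stl is unchanged.
base = tempfile.mkdtemp()
thing_dir = os.path.join(base, "bike-mount")
os.makedirs(thing_dir)
for name in ("a.stl", "b.stl"):
    with open(os.path.join(thing_dir, name), "w") as handle:
        handle.write(name)

archive = archive_and_refresh(thing_dir, datetime(2020, 1, 2, 3, 4, 5), ["a.stl"])
```

After the call, the archived directory still holds both old files, while the fresh `bike-mount` directory contains only `a.stl` and is ready to receive the updated download.
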
## Changelog
* v0.6.2
  - Added catches for 404s, 504s and malformed pages
* v0.6.1
  - now downloads readme.txt and licence details
* v0.6.0
  - added support for downloading multiple things/design sets/collections from the command line
* v0.5.0
  - better logging options
  - batch mode
* v0.4.0
  - Added a changelog
  - Now download associated images
  - support `-d` to specify base download directory

## Todo features (maybe):
- log to file support
- less perfunctory error checking / handling
- attempt to use -failed dirs for resuming
- gui?