# thingy_grabber
Script for archiving thingiverse things.

## Usage:
````
usage: thingy_grabber.py [-h] [-l {debug,info,warning}] [-d DIRECTORY] [-f LOG_FILE] [-q] [-c] [-a API_KEY]
                         {collection,thing,user,batch,version} ...

positional arguments:
  {collection,thing,user,batch,version}
                        Type of thing to download
    collection          Download one or more entire collection(s)
    thing               Download a single thing.
    user                Download all things by one or more users
    batch               Perform multiple actions written in a text file
    version             Show the current version

optional arguments:
  -h, --help            show this help message and exit
  -l {debug,info,warning}, --log-level {debug,info,warning}
                        level of logging desired
  -d DIRECTORY, --directory DIRECTORY
                        Target directory to download into
  -f LOG_FILE, --log-file LOG_FILE
                        Place to log debug information to
  -q, --quick           Assume date ordering on posts
  -c, --compress        Compress files
  -a API_KEY, --api-key API_KEY
                        API key for thingiverse
````

## API KEYs
Thingy_grabber v0.10.0 accesses thingiverse in a _substantially_ different way to before. The plus side is it should be more reliable, possibly faster and no longer needs selenium or a firefox instance (and so drastically reduces memory overhead). The downside is you are _going_ to have to do something to continue using the app - basically get yourself an API KEY.

To do this, go to https://www.thingiverse.com/apps/create and create your own app, selecting Desktop app.
Once you have your key, either specify it on the command line (`-a`) or put it in a text file called `api.key` wherever you are running the script from - the script will auto load it.
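
The lookup order described above could be sketched roughly like this. It is an illustration only, not the script's actual code: the function name `load_api_key` and its parameters are made up for the example, and only the `api.key` file name and the `-a` option come from this README.

```python
from pathlib import Path
from typing import Optional


def load_api_key(cli_key: Optional[str] = None, key_file: str = "api.key") -> Optional[str]:
    """Return the key given on the command line, else the contents of key_file.

    Hypothetical helper for illustration; not thingy_grabber's real code.
    """
    if cli_key:
        # A key passed with -a / --api-key takes priority.
        return cli_key.strip()
    path = Path(key_file)
    if path.is_file():
        # Otherwise fall back to api.key in the directory the script runs from.
        key = path.read_text(encoding="utf-8").strip()
        return key or None
    # No key found anywhere.
    return None
```

Stripping whitespace means a trailing newline in `api.key` (which most editors add) does not end up inside the key.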

### Why can't I use yours?
Because API keys can be (are?) rate limited.

## Downloads
The latest version can be downloaded from here: https://github.com/cwoac/thingy_grabber/releases/. Under the 'assets' triangle there are precompiled binaries for windows (no python needed!).

## Getting started
First download the code. Either grab the source, or get the windows binary from above and extract it somewhere. If you are running from source, see `requirements.yaml` for the packages you need. You will also need an API key (as above) and to make a directory to store your downloads in.

Oh, and you need to know what you want to download, ofc. It can be either things, collections or just the designs of a user.
Once you have done all this you need to open a command prompt and run it.

Let's say you are running windows, using the precompiled binary, have extracted the release to the `thingy_grabber` directory on your desktop and have made a `things` directory in your `Documents` directory.
When you open the command window, it will start in your home directory (say `c:\Users\cwoac`).
`cd Desktop\thingy_grabber` to get to `c:\Users\cwoac\Desktop\thingy_grabber` and check that you are in the right place by trying to run `thingy_grabber` - you should get a long list of possible command line options that looks a lot like the list further up.
Supposing you want to download all of my stuff (for some crazy reason), then the command will look like this:

`thingy_grabber -a YOURAPIKEY -d "c:\Users\cwoac\Documents\things" -c user cwoac`

The `-c` will cause the script to compress the download to a 7z file to save space. If you prefer to leave it uncompressed, just omit the `-c`.
That's the basics. Well, actually, there isn't much more than that to be honest. There is a batch mode so if you create a text file with a list of lines like
```
user cwoac
user solutionlesn
collection cwoac at2018
```
then you can use the `batch` target to run each of these in turn. If you run it a second time with the same options it will only download things which have changed or been added.

## Modes
### Things
`thingy_grabber.py thing thingid1 thingid2 ...`
This will create a directory named after the title of the thing(s) with the given ID(s) and download the files into it.

### Collections
`thingy_grabber.py collection user_name collection_name1 collection_name2`
Where `user_name` is the name of the creator of the collection (not necessarily your name!) and `collection_name1`, `collection_name2`, etc. are the name(s) of the collection(s) you want.

This will create a series of directories `user-collection/thing-name` for each thing in the collection.

If for some reason a download fails, it will get moved sideways to `thing-name-failed` - this way if you rerun it, it will only reattempt any failed things.

### User designs
`thingy_grabber.py user user_name1 user_name2 ...`
Where `user_name1 ...` are the names of the creators.

This will create a series of directories `user designs/thing-name` for each thing that user has designed.

If for some reason a download fails, it will get moved sideways to `thing-name-failed` - this way if you rerun it, it will only reattempt any failed things.
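
As a rough picture of what "moved sideways" means, here is a minimal sketch. The helper name `mark_failed` is hypothetical, and discarding an earlier `-failed` directory is an assumption made for the example rather than something this README states.

```python
import shutil
from pathlib import Path


def mark_failed(thing_dir: Path) -> Path:
    """Rename thing_dir to '<name>-failed' and return the new path.

    Hypothetical helper for illustration; not thingy_grabber's real code.
    """
    failed_dir = thing_dir.with_name(thing_dir.name + "-failed")
    if failed_dir.exists():
        # Assumption: an older failed attempt is simply thrown away.
        shutil.rmtree(failed_dir)
    thing_dir.rename(failed_dir)
    return failed_dir
```

Because the original `thing-name` directory no longer exists afterwards, a rerun sees the thing as missing and tries it again.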

### Batch mode
`thingy_grabber.py batch batch_file`
This will load a given text file and parse it as a series of calls to this script. Each line of the file should be of the form `command arg1 ...`.
Be warned that there is currently NO validation that you have given a correct set of commands!

An example:
````
thing 3670144
collection cwoac bike
user cwoac
````
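
Since the script does not validate batch files, you may want to check one yourself before scheduling it. The following is a minimal sketch of such a check; `parse_batch`, `KNOWN_COMMANDS` and the whitespace-splitting rule are assumptions made for the example and are not part of thingy_grabber.

```python
from typing import List, Tuple

# Sub-commands that make sense inside a batch file (taken from the usage text).
KNOWN_COMMANDS = {"thing", "collection", "user"}


def parse_batch(text: str) -> List[Tuple[str, List[str]]]:
    """Split batch text into (command, args) pairs, rejecting malformed lines."""
    actions = []
    for number, line in enumerate(text.splitlines(), start=1):
        words = line.split()  # assumption: arguments are whitespace separated
        if not words:
            continue  # skip blank lines
        command, args = words[0], words[1:]
        if command not in KNOWN_COMMANDS:
            raise ValueError(f"line {number}: unknown command {command!r}")
        if not args:
            raise ValueError(f"line {number}: {command} needs at least one argument")
        if command == "collection" and len(args) < 2:
            raise ValueError(f"line {number}: collection needs a user and a collection name")
        actions.append((command, args))
    return actions
```

Run against the example file above, this returns three actions: one `thing`, one `collection` and one `user`.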

If you are using linux, you can just add an appropriate call to the crontab. If you are using windows, it's a bit more of a faff, but at least according to [this link](https://www.technipages.com/scheduled-task-windows), you should be able to with a command something like this (this is not tested!): `schtasks /create /tn thingy_grabber /tr "c:\path\to\thingy_grabber.py -d c:\path\to\output\directory batch c:\path\to\batchfile.txt" /sc weekly /d wed /st 13:00:00`
You may have to play with the quotation marks to make that work though.

### Quick mode
All modes now support 'quick mode' (`-q`), although this has no effect for individual item downloads. As thingiverse sorts its returned items in descending last modified order (I believe), once we have determined that we have the most recent version of a given thing in a collection, we can safely stop processing that collection as we should have _all_ the remaining items in it already. This _substantially_ speeds up the process of keeping big collections up to date and will noticeably reduce the server load it generates.

*Warning:* As it stops as soon as it finds an up-to-date successful model, if you have unfixed failed downloads further down the list (for want of a better term), they will _not_ be retried.

*Warning:* At the moment I have not conclusively proven to myself that the result is ordered by last updated and not upload time. Once I have verified this, I will probably be making this the default option.
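
The early exit behind quick mode, and the reason for the first warning, can be seen in a small sketch. The function `sync_group` and its callback parameters are hypothetical and only model the behaviour described above; they are not the script's real interface.

```python
from typing import Callable, Iterable, List


def sync_group(
    thing_ids: Iterable[str],
    is_up_to_date: Callable[[str], bool],
    download: Callable[[str], None],
    quick: bool = False,
) -> List[str]:
    """Download outdated things; in quick mode stop at the first up-to-date one.

    thing_ids is assumed to be ordered most recently modified first.
    Returns the ids that were downloaded.
    """
    downloaded = []
    for thing_id in thing_ids:
        if is_up_to_date(thing_id):
            if quick:
                # Everything after this point is older, so assume we have it.
                break
            continue
        download(thing_id)
        downloaded.append(thing_id)
    return downloaded
```

With the ids `["a", "b", "c", "stale", "d"]`, where only `c` and `d` are current locally, quick mode downloads `a` and `b` and then stops at `c`. The entry `stale` is never revisited, which is exactly the failed-download case the first warning describes. Without `quick`, all three outdated entries are fetched.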

## Examples
`thingy_grabber.py collection cwoac bike`
Download the collection 'bike' by the user 'cwoac'.

`thingy_grabber.py -d downloads -l warning thing 1234 4321 1232`
Download the three things 1234, 4321 and 1232 into the directory `downloads`. Only give warnings.

`thingy_grabber.py -d c:\downloads -l debug user jim bob`
Download all designs by jim and bob into directories under `c:\downloads`, and give lots of debug messages.

## Requirements
python3, requests, py7zr (>=0.8.2)

## Current features:
- Can download an entire collection, creating separate subdirs for each thing in the collection.
- If you run it again with the same settings, it will check for updated files and only update what has changed. This should make it suitable for syncing a collection on a cronjob.
- If there is an updated file, the old directory will be moved to `name_timestamp` where `timestamp` is the last upload time of the old files. The code will then copy unchanged files across and download any new ones.
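
The update handling in the last bullet can be pictured with the sketch below. It is an approximation under stated assumptions: the helper `start_update`, its parameters and the timestamp format used in the example are all hypothetical, and the real script decides for itself which files count as unchanged.

```python
import shutil
from pathlib import Path
from typing import Iterable


def start_update(thing_dir: Path, old_timestamp: str, unchanged: Iterable[str]) -> Path:
    """Move thing_dir aside as '<name>_<timestamp>' and seed a fresh copy.

    Files listed in `unchanged` are copied back from the old directory;
    the caller would then download whatever is new or modified.
    Returns the path of the renamed old directory.
    """
    old_dir = thing_dir.with_name(f"{thing_dir.name}_{old_timestamp}")
    thing_dir.rename(old_dir)
    thing_dir.mkdir()
    for file_name in unchanged:
        # copy2 also keeps the file's modification time.
        shutil.copy2(old_dir / file_name, thing_dir / file_name)
    return old_dir
```

For example, `start_update(Path("things/bracket"), "2020-05-01_12.00.00", ["keep.stl"])` would leave the old files in `things/bracket_2020-05-01_12.00.00` and a new `things/bracket` containing only `keep.stl`, ready for the changed files to be downloaded into it.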

## Changelog
* v0.10.5
  - Fixed handling users with >30 things (thanks Clinton).
  - Added standard contrib and code of conduct files.
* v0.10.4
  - Readme.txt files are now text files, not HTML files.
  - Removed some debug print statements that I forgot to remove from the last release (oops).
* v0.10.3
  - Handle trailing whitespace in thing names
  - Fix raw thing grabbing
* v0.10.2
  - Fixed regression in rest API
* v0.10.1
  - A couple of minor bug fixes on exception handling.
* v0.10.0
  - API access! New -a option to provide an API key for more stable access.
* v0.9.0
  - Compression! New -c option will use 7z to create an archival copy of the file once downloaded.
    Note that although it will use the presence of 7z files to determine if a file has been updated, it currently _won't_ read old files from inside the 7z for handling updates, resulting in marginally larger bandwidth usage when dealing with partially updated things. This will be fixed later.
  - Internal tidying of how old directories are handled - I've tested this fairly heavily, but do let me know if there are issues.
* v0.8.7
  - Always, Always generate a valid time stamp.
* v0.8.6
  - Handle thingiverse returning no files for a thing gracefully.
* v0.8.5
  - Strip '.'s from the end of filenames
  - If you fail a download for an already failed download it no longer throws an exception
  - Truncates paths that are too long for windows
* v0.8.4
  - Just use unicode filenames - puts the unicode characters back in!
  - Force selenium to shutdown firefox on assert and normal exit
* v0.8.3
  - Strip unicode characters from license text
* v0.8.2
  - Strip unicode characters from filenames
* v0.8.1
  - Fix bug when all files were created / updated in October after the 9th.
* v0.8.0
  - Updated to support new thingiverse front end
* v0.7.0
  - Add new quick mode that stops once it has 'caught up' for a group
* v0.6.3
  - Caught edge case involving old dir clashes
  - Add support for separate log file
* v0.6.2
  - Added catches for 404s, 504s and malformed pages
* v0.6.1
  - Now downloads readme.txt and licence details
* v0.6.0
  - Added support for downloading multiple things/design sets/collections from the command line
* v0.5.0
  - Better logging options
  - Batch mode
* v0.4.0
  - Added a changelog
  - Now download associated images
  - Support `-d` to specify base download directory

## Todo features (maybe):
- attempt to use -failed dirs for resuming
- gui?