Strip unicode characters from license text

author Oliver Matthews <oliver@codersoffortune.net>
Tue, 14 Apr 2020 10:21:44 +0000 (11:21 +0100)
committer Oliver Matthews <oliver@codersoffortune.net>
Tue, 14 Apr 2020 10:21:44 +0000 (11:21 +0100)
diff --git a/README.md b/README.md
index 703b00a..2b3b4ef 100644
--- a/README.md
+++ b/README.md
@@ -85,6 +85,8 @@ python3, beautifulsoup4, requests, lxml
  - If there is an updated file, the old directory will be moved to `name_timestamp` where `timestamp` is the last upload time of the old files. The code will then copy unchanged files across and download any new ones.
  
  ## Changelog
+* v0.8.3
+  - Strip unicode characters from license text
  * v0.8.2
    - Strip unicode characters from filenames
  * v0.8.1
diff --git a/thingy_grabber.py b/thingy_grabber.py
index 9942b9a..552dd5d 100755
--- a/thingy_grabber.py
+++ b/thingy_grabber.py
@@ -37,7 +37,7 @@ NO_WHITESPACE_REGEX = re.compile(r'[-\s]+')
  DOWNLOADER_COUNT = 1
  RETRY_COUNT = 3
  
-VERSION = "0.8.2"
+VERSION = "0.8.3"
  
  
  #BROWSER = webdriver.PhantomJS('./phantomjs')
@@ -332,7 +332,7 @@ class Thing:
                  logging.error(link_date)
  
          self._image_links=[x.find_element_by_xpath(".//img").get_attribute("src") for x in pc.images]
-        self._license = pc.license
+        self._license = strip_invalid_chars(pc.license)
          self.pc = pc
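
Note: the body of `strip_invalid_chars` is not part of this diff, so its exact behaviour cannot be read from the change above. As a minimal sketch, assuming the helper simply drops every non-ASCII character, the effect on the license text would look roughly like the following. The name `strip_non_ascii` and the sample string are hypothetical and stand in for the real helper and real license text.

```python
def strip_non_ascii(value: str) -> str:
    """Hypothetical stand-in for strip_invalid_chars (assumed behaviour).

    Encodes to ASCII while ignoring anything that cannot be represented,
    then decodes back to str, so non-ASCII characters are silently dropped.
    """
    return value.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    # Sample license text containing an en dash and a copyright sign.
    sample = "Creative Commons \u2013 Attribution \u00a9"
    print(repr(strip_non_ascii(sample)))
```

Dropping characters this way is lossy: punctuation such as dashes disappears rather than being replaced, which may leave doubled spaces in the stored license text.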