A command line (CLI) program for monitoring and downloading 8chan threads. Licensed under MIT.

Lizard, the 8ch monitor

Lizard is a command line (CLI) program for monitoring 8ch threads. After you add a thread to its watchlist, it can connect to 8ch and check if new replies were made, notifying you if so. It will try to keep a local copy of the thread and the files in it in case the thread dies. It can also open all threads with new replies in your browser with a single command.

Changelog

0.3 (in development)

This version will change syntax for some commands:

  • Database creation: lizard c becomes lizard create
  • Conservative refresh:
    • lizard rc no longer a valid command
    • lizard r does conservative refresh
  • Refresh all threads: lizard ra instead of lizard r

These changes take effect with version 0.3 so be ready to update your scripts!
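If your scripts build the lizard command line in Python, the renames listed above can be captured in a small lookup table. This is only a sketch: the helper name translate_args is made up for illustration, and commands not mentioned in the list are passed through untouched because nothing is stated about them.

```python
# Old (0.2) -> new (0.3) command names, taken from the list above.
RENAMED_IN_0_3 = {
    "c": "create",  # database creation
    "rc": "r",      # conservative refresh
    "r": "ra",      # refresh all threads
}


def translate_args(args):
    """Return a 0.3-style argument list for a 0.2-style one (hypothetical helper)."""
    if not args:
        return []
    command, rest = args[0], list(args[1:])
    # Anything not in the table is assumed to be unchanged.
    return [RENAMED_IN_0_3.get(command, command)] + rest
```

For example, translate_args(["rc"]) gives ["r"], while an add command with its URL comes back unchanged.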

0.2 (current)

  • Help message now shows version. (issue #24)
  • New command: rl is refresh + list
  • New command: ro is refresh + open in browser
  • c and b now print more useful output.
  • Better handling of non-existent/invalid database.

0.1.1

  • Now explicitly specifying python3 in script header, so it shouldn’t attempt running with python2.
  • lizard e output format matches the new packaged system.

0.1

  • Introduced package system.

Installation

Linux

  1. Download the most recent package from under dist/
  2. Extract this into your root directory: sudo tar -zvxf lizard-<VERSION HERE>.linux-x86_64.tar.gz -C /

If you don’t have the most recent Python3 installed and run into a problem, update your Python3 installation and try again.

Upgrading

If you’ve been using Lizard for a while and are upgrading to a new release, be very careful with your local data. When a new version changes the database code or how local files are stored, it could fail to read your old shit and even corrupt it. I try to note this somewhere when it happens, but I may forget or you may not see the announcement. If in doubt, back up everything (all of Lizard’s data is stored under ~/lizard_data/) before trying the new version.
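One way to take that backup is to pack the whole data directory into a timestamped archive. The sketch below uses only the standard library; the function name, the archive name and the default destination (your home directory) are arbitrary choices, not anything Lizard itself does.

```python
import shutil
import time
from pathlib import Path


def backup_lizard_data(data_dir=Path.home() / "lizard_data", dest_dir=Path.home()):
    """Pack data_dir into a .tar.gz under dest_dir and return the archive path."""
    data_dir = Path(data_dir)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"{data_dir} does not exist")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base_name = Path(dest_dir) / f"lizard_data-backup-{stamp}"
    # make_archive adds the ".tar.gz" suffix itself and returns the full name.
    archive = shutil.make_archive(
        str(base_name), "gztar",
        root_dir=str(data_dir.parent), base_dir=data_dir.name,
    )
    return Path(archive)
```

Calling backup_lizard_data() with no arguments archives ~/lizard_data into your home directory. Keep the archive somewhere safe until you have confirmed the new version reads your data.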

If you know for sure that a version breaks backwards compatibility, you have the following options:

  • Just fuck it and start over with a fresh database.

  • Run lizard e in your old lizard, save the resulting file, install the new version and then re-construct your watch list by running the exported commands. If you had any unseen replies or something you might lose those, so I’d check first with lizard r and lizard l. It might also re-download the old files if I changed where the files are stored. If you can figure out where they are supposed to go you can copy them over to avoid the re-downloading.

  • Open the databases in another program (see FAQ) and manually migrate the data.

  • Beg me to make a migration script.

Running sources directly

You can also run the .py files in the repo directly. If you want to do this I’m assuming you have enough proficiency with Python to figure it out. FYI, I use PyCharm to develop so it might be easier to get it to work with it.

Usage

Running lizard without any arguments will print the help message explaining the syntax. It will also create a ~/lizard_data directory to store files.

After you install, run lizard c to create a new database (if you don’t have one already). Everything else will crash unless a valid database exists.

Add a thread to the database with lizard a <URL>. Upon adding a thread, Lizard will immediately download a copy of the thread and the files in it. These will be put under ~/lizard_data/.

Troubleshooting

Make sure you have Python 3.6 and are using it to run Lizard. Python 2 will absolutely not run it, and the libraries I use may cause problems with older versions of Python 3.
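To see which interpreter you are actually getting, a quick check like the following can help. It treats 3.6 as a minimum, which is an assumption: the text above only names 3.6, so newer versions are presumed fine rather than confirmed.

```python
import sys

MIN_VERSION = (3, 6)  # the version named above, treated here as a minimum


def python_is_new_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """True if the given interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


print(sys.executable, sys.version.split()[0],
      "looks fine" if python_is_new_enough() else "is too old for Lizard")
```

Run it with the same python3 that the lizard script header picks up, otherwise you are checking the wrong interpreter.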

I generate the release archive with Python’s distutils. I have no idea how it deals with dependencies on other Python packages. You can install these with pip install <package name>. Off the top of my head you need peewee, humanize and requests. I also use os, shutil, webbrowser, json and re, but these are part of the Python 3 standard library.
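A quick way to find out which packages still need a pip install is to ask Python whether it can locate them. This is a sketch with a made-up helper name; requests is on the list because it is a third-party package from PyPI, not part of the standard library.

```python
import importlib.util

# Third-party packages Lizard imports, as far as this README knows.
THIRD_PARTY = ["peewee", "humanize", "requests"]


def missing_packages(names=THIRD_PARTY):
    """Return the names from `names` that this interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


for name in missing_packages():
    print("missing:", name, "-> try: pip install", name)
```

If it prints nothing, the interpreter you ran it with can see all three packages.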

Installation problems

Untarring the archive should produce something like this:

$ sudo tar -zvxf dist/lizard-0.1.1.linux-x86_64.tar.gz -C /
./
./usr/
./usr/lib/
./usr/lib/python3.6/
./usr/lib/python3.6/site-packages/
./usr/lib/python3.6/site-packages/db_methods.py
./usr/lib/python3.6/site-packages/constants.py
./usr/lib/python3.6/site-packages/__pycache__/
./usr/lib/python3.6/site-packages/__pycache__/json_methods.cpython-36.pyc
./usr/lib/python3.6/site-packages/__pycache__/db_model.cpython-36.pyc
./usr/lib/python3.6/site-packages/__pycache__/lizard.cpython-36.pyc
./usr/lib/python3.6/site-packages/__pycache__/time_methods.cpython-36.pyc
./usr/lib/python3.6/site-packages/__pycache__/constants.cpython-36.pyc
./usr/lib/python3.6/site-packages/__pycache__/file_io.cpython-36.pyc
./usr/lib/python3.6/site-packages/__pycache__/web_methods.cpython-36.pyc
./usr/lib/python3.6/site-packages/__pycache__/db_methods.cpython-36.pyc
./usr/lib/python3.6/site-packages/file_io.py
./usr/lib/python3.6/site-packages/lizard.py
./usr/lib/python3.6/site-packages/web_methods.py
./usr/lib/python3.6/site-packages/db_model.py
./usr/lib/python3.6/site-packages/json_methods.py
./usr/lib/python3.6/site-packages/lizard-0.1.1-py3.6.egg-info/
./usr/lib/python3.6/site-packages/lizard-0.1.1-py3.6.egg-info/dependency_links.txt
./usr/lib/python3.6/site-packages/lizard-0.1.1-py3.6.egg-info/PKG-INFO
./usr/lib/python3.6/site-packages/lizard-0.1.1-py3.6.egg-info/top_level.txt
./usr/lib/python3.6/site-packages/lizard-0.1.1-py3.6.egg-info/requires.txt
./usr/lib/python3.6/site-packages/lizard-0.1.1-py3.6.egg-info/SOURCES.txt
./usr/lib/python3.6/site-packages/lizard-0.1.1-py3.6.egg-info/entry_points.txt
./usr/lib/python3.6/site-packages/time_methods.py
./usr/bin/
./usr/bin/lizard
$ 

As you can see, this puts a link to the program under /usr/bin/lizard and all the program code under /usr/lib/python3.6/site-packages/. Confirm that the files are actually there. If they are not, first figure out where they went, delete them, and then manually extract and move the archive contents to these locations. The archive is generated with absolute paths as they should be on your system, so consult these (or the above list) when verifying that you’ve got everything where it needs to be.
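Checking the files by hand gets tedious, so here is a sketch that does it for you. The path list is copied from the 0.1.1 listing above, minus the __pycache__ and egg-info entries. Other releases may ship a different set, and the helper name is made up.

```python
from pathlib import Path

# Paths relative to the extraction root, taken from the 0.1.1 listing above.
SITE_PACKAGES = "usr/lib/python3.6/site-packages"
EXPECTED_FILES = ["usr/bin/lizard"] + [
    f"{SITE_PACKAGES}/{module}.py"
    for module in (
        "constants", "db_methods", "db_model", "file_io",
        "json_methods", "lizard", "time_methods", "web_methods",
    )
]


def missing_files(root="/", expected=EXPECTED_FILES):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


for rel in missing_files():
    print("not found: /" + rel)
```

An empty result means every file from the listing is where the archive was supposed to put it.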

Alternatively you can try running the scripts with the interpreter. Just clone or download the repo, then navigate to the directory, and run python3 lizard.py. If that prints the help message then try other troubleshooting steps except replace lizard with python3 lizard.py in the commands.

Does Lizard work at all?

Simply running lizard without any commands will print a help message (and also ensure that ~/lizard_data/ exists). If you can’t get this to happen, the problem is probably your environment.

Next, try to run lizard c. It should create a new database, but if one already exists you need to get rid of it (either delete or rename). lizard b will back up the DB for you, but it won’t erase the current one. Make sure the database it creates is ~/lizard_data/threads.orm.db and not somewhere else.

If lizard c is able to create a database, then see if you can add a thread. Try lizard a <URL of a thread> for a few different threads. If none of them worked, again, probably something is fucked in your Lizard installation. If all worked, great! Lizard should be working correctly. If only some threads work, it’s probably because of a bug specific to those threads that I didn’t know about. Submit an issue and include steps to reproduce.

Corrupted database

If your existing threads.orm.db file causes problems, but a fresh one works fine, either your database is just fucked or one of the threads in it is causing problems. You can eliminate the latter possibility by re-adding the threads to the fresh database. If there’s a thread that consistently causes problems, congratulations, you found a bug! Submit an issue and wait for me to fix it. Meanwhile, don’t add that thread. If you manage to re-add all the threads to the fresh database and everything works, just delete the old database and forget about it. Life’s too short, man.
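Since the database is a SQLite file, you can also ask SQLite itself whether the file is structurally sound before giving up on it. The sketch below opens the file read-only and runs PRAGMA integrity_check. The helper name is made up, and a clean result only rules out file-level corruption, not bad data in a particular thread.

```python
import sqlite3
from pathlib import Path

DEFAULT_DB = Path.home() / "lizard_data" / "threads.orm.db"


def database_looks_healthy(db_path=DEFAULT_DB):
    """True if db_path exists and passes SQLite's own integrity check."""
    db_path = Path(db_path)
    if not db_path.is_file():
        return False
    try:
        # mode=ro makes sure the check itself cannot modify the file.
        conn = sqlite3.connect(db_path.resolve().as_uri() + "?mode=ro", uri=True)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Raised, for instance, when the file is not a SQLite database at all.
        return False
    return rows == [("ok",)]
```

If this returns False for the old file and True for a fresh one, the file itself is damaged and re-adding the threads is the way to go.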

FAQ

Is lizard only for Linux?

It’s easy for me to build the Linux package, so I only supply that.

Python is supposed to be cross-platform so it should work on other OSes too. If anyone is actually interested in running it on something else, I can try to build for that also.

Is there a GUI?

No. An earlier version of this program had a GUI, but it turned out to be more trouble than it’s worth.

Why doesn’t Lizard automatically refresh threads at set intervals?

The user is expected to use the scheduling system of their OS to accomplish this.

For instance, on Linux you can set up cron to run lizard rc regularly (from version 0.3 on, the conservative refresh is lizard r), and maybe redirect output to your own log file as well.
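If you would rather keep the logging in a script than in the crontab line, a small wrapper can run the command and append its output to a file. This is a sketch: run_and_log is a made-up name and the log location is whatever you pass in.

```python
import subprocess
import time


def run_and_log(command, log_path):
    """Run `command`, append its combined output to `log_path`, return the exit code."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold errors into the same log
        universal_newlines=True,    # text output; also works on Python 3.6
    )
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        log.write(f"[{stamp}] {' '.join(command)} exited with {result.returncode}\n")
        log.write(result.stdout)
    return result.returncode
```

A cron job could then call a script whose last line is something like run_and_log(["lizard", "rc"], "/path/to/your/refresh.log"), with the command adjusted to the version you have installed.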

How can I inspect the database?

lizard l will print out a summary, but if you want the nitty gritty you need a SQLite browser such as sqliteman or DB Browser for SQLite.
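If you don’t want to install a browser, Python’s built-in sqlite3 module is enough for a first look. The sketch below makes no assumption about Lizard’s schema: it asks SQLite which tables exist and counts the rows in each. The function name is made up.

```python
import sqlite3


def describe_database(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    conn = sqlite3.connect(str(db_path))
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        # Table names come from the database itself; quote them for safety.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

Point it at a copy of ~/lizard_data/threads.orm.db rather than the live file if Lizard might be running at the same time.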

Why does it create so many .json and .html files in the data directory?

These are saved at each refresh in case the thread dies or any replies are deleted. Deleting these files when they are no longer needed is the user’s responsibility.
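For the cleanup, something like the following can list the candidates. It is a sketch under a few assumptions: the 30-day threshold is arbitrary, age is judged by modification time, and the function only lists files so that you decide what actually gets deleted.

```python
import time
from pathlib import Path


def stale_snapshots(data_dir, max_age_days=30, now=None):
    """List .json/.html files under data_dir not modified for max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        path for path in Path(data_dir).rglob("*")
        if path.is_file()
        and path.suffix in {".json", ".html"}
        and path.stat().st_mtime < cutoff
    )
```

Review the list first. Once you are sure a thread’s old snapshots are no longer needed, calling unlink() on each returned path removes them.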

How do you handle file names?

I use the original filenames with the first 10 characters of 8chan’s hash-based name. Adding the hash is necessary because 8ch allows duplicate filenames in the same thread, and they would otherwise conflict with each other when saved with the original name.
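To illustrate why the hash fragment prevents collisions, here is a sketch of one possible naming scheme. The exact layout (hash fragment first, then an underscore, then the original name) is an assumption made for the example, not a description of what Lizard really writes to disk.

```python
from pathlib import PurePosixPath

HASH_PREFIX_LENGTH = 10  # "the first 10 characters" from the text above


def local_filename(original_name, hashed_name):
    """Combine a hash fragment with the original name (hypothetical layout)."""
    # Drop the extension of the hash-based name before taking the prefix.
    hash_part = PurePosixPath(hashed_name).stem[:HASH_PREFIX_LENGTH]
    return f"{hash_part}_{original_name}"
```

Two files both uploaded as cat.jpg end up with different local names as long as the first ten characters of their hash-based names differ.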

How does the conservative refresh predict replies?

If the last reply was x seconds ago, and the thread was last refreshed y seconds ago, lizard doesn’t bother refreshing while x/y > 10, that is, until at least a tenth of the thread’s idle time has passed since the last refresh. Internally, the 10 is called conservative_refresh_criterion. Increasing this will make it less conservative (more refreshes), decreasing will make it more conservative.
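The heuristic fits in a few lines. The sketch below reads the rule as "skip the refresh while x/y is above the criterion", which is the reading that matches the note that a larger criterion is less conservative. That reading, the function name and the handling of y = 0 are assumptions, not a copy of Lizard’s code.

```python
CONSERVATIVE_REFRESH_CRITERION = 10


def should_refresh(seconds_since_last_reply, seconds_since_last_refresh,
                   criterion=CONSERVATIVE_REFRESH_CRITERION):
    """Decide whether a thread is worth refreshing right now."""
    x = seconds_since_last_reply
    y = seconds_since_last_refresh
    if y <= 0:
        # Refreshed this very moment: nothing new can be expected yet.
        return False
    # Skip while the thread has been idle for more than `criterion` times
    # as long as we have been waiting since the last refresh.
    return x / y <= criterion
```

With the default criterion, a thread whose last reply was an hour ago is skipped one minute after a refresh (ratio 60) but refreshed again ten minutes after it (ratio 6).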

In reality I think replies would follow a bimodal (people who browse only the front page and people who browse the catalog) or trimodal (bimodal plus people who watch individual threads, such as with this tool) distribution, with each mode itself being a Poisson. Interesting as that is, I think my simple heuristic works well enough so I’m not planning to implement such rigorous statistical analysis for the time being.