Merge pull request #30 from aliparlakci/SelfDownloader
- Added self post download feature
- Made the searching process quicker by writing posts to file at the end of the search
- Added long file bug solution to remaining download classes
- Updated the README file to make it minimal
80
README.md
@@ -5,23 +5,28 @@ This program downloads imgur, gfycat and direct image and video links of saved p
|
|||||||
|
|
||||||
## Table of Contents
|
## Table of Contents
|
||||||
|
|
||||||
|
- [What it can do?](#what-it-can-do)
|
||||||
- [Requirements](#requirements)
|
- [Requirements](#requirements)
|
||||||
- [Setting up the script](#setting-up-the-script)
|
- [Setting up the script](#setting-up-the-script)
|
||||||
- [Creating an imgur app](#creating-an-imgur-app)
|
- [Creating an imgur app](#creating-an-imgur-app)
|
||||||
- [Program Modes](#program-modes)
|
- [Program Modes](#program-modes)
|
||||||
- [saved mode](#saved-mode)
|
|
||||||
- [submitted mode](#submitted-mode)
|
|
||||||
- [upvoted mode](#upvoted-mode)
|
|
||||||
- [subreddit mode](#subreddit-mode)
|
|
||||||
- [multireddit mode](#multireddit-mode)
|
|
||||||
- [link mode](#link-mode)
|
|
||||||
- [log read mode](#log-read-mode)
|
|
||||||
- [Running the script](#running-the-script)
|
- [Running the script](#running-the-script)
|
||||||
- [Using the command line arguments](#using-the-command-line-arguments)
|
- [Using the command line arguments](#using-the-command-line-arguments)
|
||||||
- [Examples](#examples)
|
- [Examples](#examples)
|
||||||
- [FAQ](#faq)
|
- [FAQ](#faq)
|
||||||
- [Changelog](#changelog)
|
- [Changelog](#changelog)
|
||||||
- [release-1.0.0](#release-100)
|
|
||||||
|
## What it can do?
|
||||||
|
### It...
|
||||||
|
- can get posts from: frontpage, subreddits, multireddits, redditor's submissions, upvoted and saved posts; search results or just plain reddit links
|
||||||
|
- sorts post by hot, top, new and so on
|
||||||
|
- downloads imgur albums, gfycat links, [self posts](#i-cant-open-the-self-posts) and any link to a direct image
|
||||||
|
- skips the existing ones
|
||||||
|
- puts post titles to file's name
|
||||||
|
- puts every post to its subreddit's folder
|
||||||
|
- saves a reusable copy of the details of found posts so that they can be downloaded again
|
||||||
|
- logs failed ones in a file so that you can try to download them later
|
||||||
|
- can be run with double-clicking on Windows (but I don't recommend it)
|
||||||
|
|
||||||
## Requirements
|
## Requirements
|
||||||
- Python 3.x*
|
- Python 3.x*
|
||||||
@@ -49,38 +54,27 @@ It should redirect to a page which shows your **imgur_client_id** and **imgur_cl
|
|||||||
|
|
||||||
## Program Modes
|
## Program Modes
|
||||||
All the program modes are activated with command-line arguments as shown [here](#using-the-command-line-arguments)
|
All the program modes are activated with command-line arguments as shown [here](#using-the-command-line-arguments)
|
||||||
### saved mode
|
- **saved mode**
|
||||||
In saved mode, the program gets posts from given user's saved posts.
|
- Gets posts from given user's saved posts.
|
||||||
### submitted mode
|
- **submitted mode**
|
||||||
In submitted mode, the program gets posts from given user's submitted posts.
|
- Gets posts from given user's submitted posts.
|
||||||
### upvoted mode
|
- **upvoted mode**
|
||||||
In submitted mode, the program gets posts from given user's upvoted posts.
|
- Gets posts from given user's upvoted posts.
|
||||||
### subreddit mode
|
- **subreddit mode**
|
||||||
In subreddit mode, the program gets posts from given subreddits* that is sorted by given type and limited by given number.
|
- Gets posts from the given subreddit or subreddits, sorted by the given type and limited by the given number.
|
||||||
|
- You may also use search in this mode. See [`py -3 script.py --help`](#using-the-command-line-arguments).
|
||||||
Multiple subreddits can be given
|
- **multireddit mode**
|
||||||
|
- Gets posts from given user's given multireddit that is sorted by given type and limited by given number.
|
||||||
*You may also use search in this mode. See [`py -3 script.py --help`](#using-the-command-line-arguments).*
|
- **link mode**
|
||||||
### multireddit mode
|
- Gets posts from given reddit link.
|
||||||
In multireddit mode, the program gets posts from given user's given multireddit that is sorted by given type and limited by given number.
|
- You may customize the behaviour with `--sort`, `--time`, `--limit`.
|
||||||
### link mode
|
- You may also use search in this mode. See [`py -3 script.py --help`](#using-the-command-line-arguments).
|
||||||
In link mode, the program gets posts from given reddit link.
|
- **log read mode**
|
||||||
|
- Takes a log file created by the script itself (a JSON file), reads the posts in it and tries downloading them again.
|
||||||
You may customize the behaviour with `--sort`, `--time`, `--limit`.
|
- Running log read mode for FAILED.json file once after the download is complete is **HIGHLY** recommended as unexpected problems may occur.
|
||||||
|
|
||||||
*You may also use search in this mode. See [`py -3 script.py --help`](#using-the-command-line-arguments).*
|
|
||||||
|
|
||||||
## log read mode
|
|
||||||
Two log files are created each time *script.py* runs.
|
|
||||||
- **POSTS** Saves all the posts without filtering.
|
|
||||||
- **FAILED** Keeps track of posts that are tried to be downloaded but failed.
|
|
||||||
|
|
||||||
In log mode, the program takes a log file which created by itself, reads posts and tries downloading them again.
|
|
||||||
|
|
||||||
Running log read mode for FAILED.json file once after the download is complete is **HIGHLY** recommended as unexpected problems may occur.
|
|
||||||
|
|
||||||
## Running the script
|
## Running the script
|
||||||
**WARNING** *DO NOT* let more than *1* instance of script run as it interferes with IMGUR Request Rate.
|
**DO NOT** let more than one instance of the script run as it interferes with IMGUR Request Rate.
|
||||||
|
|
||||||
### Using the command line arguments
|
### Using the command line arguments
|
||||||
If no arguments are passed program will prompt you for arguments below which means you may start up the script with double-clicking on it (at least on Windows for sure).
|
If no arguments are passed program will prompt you for arguments below which means you may start up the script with double-clicking on it (at least on Windows for sure).
|
||||||
@@ -89,7 +83,7 @@ Open up the [terminal](https://www.reddit.com/r/NSFW411/comments/8vtnl8/meta_i_m
|
|||||||
|
|
||||||
Run the script.py file from terminal with command-line arguments. Here is the help page:
|
Run the script.py file from terminal with command-line arguments. Here is the help page:
|
||||||
|
|
||||||
**ATTENTION** Use `.\` for current directory and `..\` for upper directory when using short directories, otherwise it might act weird.
|
Use `.\` for current directory and `..\` for upper directory when using short directories, otherwise it might act weird.
|
||||||
|
|
||||||
```console
|
```console
|
||||||
$ py -3 script.py --help
|
$ py -3 script.py --help
|
||||||
@@ -166,6 +160,10 @@ py -3 script.py C:\\NEW_FOLDER\\ANOTHER_FOLDER --log UNNAMED_FOLDER\\FAILED.json
|
|||||||
### I can't startup the script no matter what.
|
### I can't startup the script no matter what.
|
||||||
- Try `python3` or `python` or `py -3` as python have real issues about naming their program
|
- Try `python3` or `python` or `py -3` as python have real issues about naming their program
|
||||||
|
|
||||||
|
### I can't open the self posts.
|
||||||
|
- Self posts are stored on reddit as Markdown, so the script downloads them as Markdown in order to keep their styling. There is a Chrome extension [here](https://chrome.google.com/webstore/detail/markdown-viewer/ckkdlimhmcjmikdlpkmbgfkaikojcbjk) for viewing Markdown files with their styling. Install it and open the files with Chrome.
|
||||||
|
|
||||||
## Changelog
|
## Changelog
|
||||||
### v1.0.0
|
### 10/07/2018
|
||||||
- Initial release
|
- Added support for *self* post
|
||||||
|
- Now getting posts is quicker
|
||||||
|
|||||||
19
script.py
@@ -11,7 +11,7 @@ import sys
|
|||||||
import time
|
import time
|
||||||
from pathlib import Path, PurePath
|
from pathlib import Path, PurePath
|
||||||
|
|
||||||
from src.downloader import Direct, Gfycat, Imgur
|
from src.downloader import Direct, Gfycat, Imgur, Self
|
||||||
from src.parser import LinkDesigner
|
from src.parser import LinkDesigner
|
||||||
from src.searcher import getPosts
|
from src.searcher import getPosts
|
||||||
from src.tools import (GLOBAL, createLogFile, jsonFile, nameCorrector,
|
from src.tools import (GLOBAL, createLogFile, jsonFile, nameCorrector,
|
||||||
@@ -451,7 +451,22 @@ def download(submissions):
|
|||||||
print(exception)
|
print(exception)
|
||||||
FAILED_FILE.add({int(i+1):[str(exception),submissions[i]]})
|
FAILED_FILE.add({int(i+1):[str(exception),submissions[i]]})
|
||||||
downloadedCount -= 1
|
downloadedCount -= 1
|
||||||
|
|
||||||
|
        elif submissions[i]['postType'] == 'self':
            print("SELF")
            try:
                Self(directory,submissions[i])

            except FileAlreadyExistsError:
                print("It already exists")
                downloadedCount -= 1
                duplicates += 1

            except Exception as exception:
                print(exception)
                FAILED_FILE.add({int(i+1):[str(exception),submissions[i]]})
                downloadedCount -= 1
|
||||||
|
|
||||||
else:
|
else:
|
||||||
print("No match found, skipping...")
|
print("No match found, skipping...")
|
||||||
downloadedCount -= 1
|
downloadedCount -= 1
|
||||||
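Each failure recorded in this hunk pairs a 1-based index with the error text and the submission itself. A minimal sketch of how entries of that shape could be read back for another attempt, which is what log read mode is described as doing. `failed_submissions` is a hypothetical helper, not a function from this repository, and the sketch assumes the log is a single JSON object; the real on-disk format of `FAILED.json` is not shown in this diff.

```python
import json


def failed_submissions(log_text):
    """Return submission dicts from entries shaped {index: [error, submission]}.

    Assumption: the log is one JSON object. Keys that are not numbers are
    skipped, in case the real file also carries metadata entries.
    """
    entries = json.loads(log_text)
    numbered = [(int(key), value) for key, value in entries.items() if key.isdigit()]
    submissions = []
    for _, value in sorted(numbered, key=lambda pair: pair[0]):
        # Keep only well-formed [error message, submission] pairs.
        if isinstance(value, list) and len(value) == 2:
            submissions.append(value[1])
    return submissions
```

The returned list has the same shape as the `submissions` argument of `download`, so it could in principle be passed straight back to it.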
|
|||||||
@@ -1,3 +1,4 @@
|
|||||||
|
import io
|
||||||
import os
|
import os
|
||||||
import sys
|
import sys
|
||||||
import urllib.request
|
import urllib.request
|
||||||
@@ -16,7 +17,7 @@ except ModuleNotFoundError:
|
|||||||
install("imgurpython")
|
install("imgurpython")
|
||||||
from imgurpython import *
|
from imgurpython import *
|
||||||
|
|
||||||
|
VanillaPrint = print
|
||||||
print = printToFile
|
print = printToFile
|
||||||
|
|
||||||
def dlProgress(count, blockSize, totalSize):
|
def dlProgress(count, blockSize, totalSize):
|
||||||
@@ -294,3 +295,45 @@ class Direct:
|
|||||||
tempDir = directory / (POST['postId']+".tmp")
|
tempDir = directory / (POST['postId']+".tmp")
|
||||||
|
|
||||||
getFile(fileDir,tempDir,POST['postURL'])
|
getFile(fileDir,tempDir,POST['postURL'])
|
||||||
|
|
||||||
|
class Self:
    def __init__(self,directory,post):
        if not os.path.exists(directory): os.makedirs(directory)

        title = nameCorrector(post['postTitle'])
        print(title+"_"+post['postId']+".md")

        fileDir = title+"_"+post['postId']+".md"
        fileDir = directory / fileDir

        if Path.is_file(fileDir):
            raise FileAlreadyExistsError

        try:
            self.writeToFile(fileDir,post)
        except FileNotFoundError:
            fileDir = post['postId']+".md"
            fileDir = directory / fileDir

            self.writeToFile(fileDir,post)

    @staticmethod
    def writeToFile(directory,post):

        content = ("## ["
                   + post["postTitle"]
                   + "]("
                   + post["postURL"]
                   + ")\n"
                   + post["postContent"]
                   + "\n\n---\n\n"
                   + "submitted by [u/"
                   + post["postSubmitter"]
                   + "](https://www.reddit.com/user/"
                   + post["postSubmitter"]
                   + ")")

        with io.open(directory,"w",encoding="utf-8") as FILE:
            VanillaPrint(content,file=FILE)

        print("Downloaded")
|
||||||
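The `Self` class above builds a Markdown document from the post and falls back to a file name made of the post id alone when the title-based name cannot be written. A standalone sketch of the same idea, under stated assumptions: `render_self_post` and `save_self_post` are hypothetical names, the title is assumed to be already sanitised (the job `nameCorrector` does in the diff), and the built-in `FileExistsError` stands in for the project's `FileAlreadyExistsError`. It catches `OSError` rather than only `FileNotFoundError`, because the error raised for an over-long name can differ between platforms.

```python
import os
from pathlib import Path


def render_self_post(post):
    """Build the Markdown body: linked title, post text, a rule, author credit."""
    author = post["postSubmitter"]
    return (
        f"## [{post['postTitle']}]({post['postURL']})\n"
        f"{post['postContent']}"
        "\n\n---\n\n"
        f"submitted by [u/{author}](https://www.reddit.com/user/{author})"
    )


def save_self_post(directory, post, safe_title):
    """Write <title>_<id>.md, or <id>.md if the longer name is rejected."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)

    target = directory / f"{safe_title}_{post['postId']}.md"
    # os.path.isfile returns False instead of raising on unusable paths.
    if os.path.isfile(target):
        raise FileExistsError(target)

    body = render_self_post(post)
    try:
        target.write_text(body, encoding="utf-8")
    except OSError:
        # The title-based name could not be created; retry with the id only.
        target = directory / f"{post['postId']}.md"
        target.write_text(body, encoding="utf-8")
    return target
```

Returning the final path lets the caller report which of the two names was actually used.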
|
|||||||
@@ -308,6 +308,10 @@ def redditSearcher(posts,SINGLE_POST=False):
|
|||||||
imgurCount = 0
|
imgurCount = 0
|
||||||
global directCount
|
global directCount
|
||||||
directCount = 0
|
directCount = 0
|
||||||
|
global selfCount
|
||||||
|
selfCount = 0
|
||||||
|
|
||||||
|
allPosts = {}
|
||||||
|
|
||||||
postsFile = createLogFile("POSTS")
|
postsFile = createLogFile("POSTS")
|
||||||
|
|
||||||
@@ -356,13 +360,15 @@ def redditSearcher(posts,SINGLE_POST=False):
|
|||||||
printSubmission(submission,subCount,orderCount)
|
printSubmission(submission,subCount,orderCount)
|
||||||
subList.append(details)
|
subList.append(details)
|
||||||
|
|
||||||
postsFile.add({subCount:[details]})
|
allPosts = {**allPosts,**details}
|
||||||
|
|
||||||
|
postsFile.add(allPosts)
|
||||||
|
|
||||||
if not len(subList) == 0:
|
if not len(subList) == 0:
|
||||||
print(
|
print(
|
||||||
"\nTotal of {} submissions found!\n"\
|
"\nTotal of {} submissions found!\n"\
|
||||||
"{} GFYCATs, {} IMGURs and {} DIRECTs\n"
|
"{} GFYCATs, {} IMGURs, {} DIRECTs and {} SELF POSTS\n"
|
||||||
.format(len(subList),gfycatCount,imgurCount,directCount)
|
.format(len(subList),gfycatCount,imgurCount,directCount,selfCount)
|
||||||
)
|
)
|
||||||
return subList
|
return subList
|
||||||
else:
|
else:
|
||||||
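This hunk moves the `POSTS` log write out of the loop: posts are accumulated in `allPosts` and written once after the search. A minimal sketch of that accumulate-then-write pattern follows, with hypothetical helper names (`collect_posts`, `write_posts_once`) and plain `json` in place of the project's `jsonFile` class. One caveat worth checking against the real code: if `details` is a flat per-post dict, as the `checkIfMatching` hunk suggests, then `{**allPosts, **details}` reuses the same keys for every post and later values replace earlier ones. The sketch keys each post by its counter instead, mirroring the `{subCount:[details]}` shape of the removed line.

```python
import json
from pathlib import Path


def collect_posts(matches):
    """Key each matching post by a running counter so entries never collide."""
    all_posts = {}
    for count, details in enumerate(matches, start=1):
        all_posts[count] = [details]
    return all_posts


def write_posts_once(path, all_posts):
    """Serialise the whole collection in a single write after the search ends."""
    Path(path).write_text(json.dumps(all_posts, indent=4), encoding="utf-8")


# Dict unpacking keeps only the last value for a repeated key.
merged = {**{"postId": "a1"}, **{"postId": "b2"}}
```

Writing once at the end trades a little memory for far fewer file operations, which matches the speed-up the commit message describes.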
@@ -372,6 +378,7 @@ def checkIfMatching(submission):
|
|||||||
global gfycatCount
|
global gfycatCount
|
||||||
global imgurCount
|
global imgurCount
|
||||||
global directCount
|
global directCount
|
||||||
|
global selfCount
|
||||||
|
|
||||||
try:
|
try:
|
||||||
details = {'postId':submission.id,
|
details = {'postId':submission.id,
|
||||||
@@ -397,13 +404,15 @@ def checkIfMatching(submission):
|
|||||||
imgurCount += 1
|
imgurCount += 1
|
||||||
return details
|
return details
|
||||||
|
|
||||||
elif isDirectLink(submission.url) is True:
|
elif isDirectLink(submission.url):
|
||||||
details['postType'] = 'direct'
|
details['postType'] = 'direct'
|
||||||
directCount += 1
|
directCount += 1
|
||||||
return details
|
return details
|
||||||
|
|
||||||
elif submission.is_self:
|
elif submission.is_self:
|
||||||
details['postType'] = 'self'
|
details['postType'] = 'self'
|
||||||
|
details['postContent'] = submission.selftext
|
||||||
|
selfCount += 1
|
||||||
return details
|
return details
|
||||||
|
|
||||||
def printSubmission(SUB,validNumber,totalNumber):
|
def printSubmission(SUB,validNumber,totalNumber):
|
||||||
|
|||||||