add feature: use date range in boundary option #50
nondanee committed Dec 15, 2019
1 parent 7cec7ae commit 4406fbc
Showing 3 changed files with 22 additions and 8 deletions.
2 changes: 1 addition & 1 deletion README-CN.md
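@@ -59,7 +59,7 @@ optional arguments:
 - `-r retry` max retries (default: `2`)
 - `-i interval` request interval (default: `1`, unit: seconds)
 - `-c cookie` login credential (requires the `SUB` value from the cookie)
-- `-b boundary` weibo mid/bid range (format: `id:id` between the two, `:id` before, `id:` after, `id` a specific one, `:` all)
+- `-b boundary` weibo mid/bid or date range (format: `id:id` between the two, `:id` before, `id:` after, `id` a specific one, `:` all)
 - `-n name` naming template (identifiers: `url`, `index`, `type`, `mid`, `bid`, `date`, `text`, `name`, similar to ["f-Strings"](https://www.python.org/dev/peps/pep-0498/#abstract) syntax)
 - `-v` also download Miaopai videos
 - `-o` re-download files that are already saved (skipped by default)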
2 changes: 1 addition & 1 deletion README.md
@@ -62,7 +62,7 @@ Optional arguments
 - `-r retry` max retries (default value: `2`)
 - `-i interval` request interval (default value: `1`, unit: second)
 - `-c cookie` login credential (only need the value of a certain key named `SUB`)
-- `-b boundary` mid/bid range of weibos (format: `id:id` between, `:id` before, `id:` after, `id` certain, `:` all)
+- `-b boundary` mid/bid/date range of weibos (format: `id:id` between, `:id` before, `id:` after, `id` certain, `:` all)
 - `-n name` naming template (identifier: `url`, `index`, `type`, `mid`, `bid`, `date`, `text`, `name`, like ["f-Strings"](https://www.python.org/dev/peps/pep-0498/#abstract) syntax)
 - `-v` download miaopai videos at the same time
 - `-o` overwrite existing files (skipping if exists for default)
26 changes: 20 additions & 6 deletions weiboPicDownloader.py
@@ -197,6 +197,17 @@ def parse_date(text):
     elif re.search(r'^[\d|-]+$', text):
         return datetime.datetime.strptime(((str(now.year) + '-') if not re.search(r'^\d{4}', text) else '') + text, '%Y-%m-%d').date()

+def compare(standard, operation, candidate):
+    for target in candidate:
+        try:
+            result = '>=<'
+            if standard > target: result = '>'
+            elif standard == target: result = '='
+            else: result = '<'
+            return result in operation
+        except TypeError:
+            pass
+
 def get_resources(uid, video, interval, limit):
     page = 1
     size = 25
@@ -227,10 +238,11 @@ def get_resources(uid, video, interval, limit):
 mblog = card['mblog']
 if 'isTop' in mblog and mblog['isTop']: continue
 mid = int(mblog['mid'])
-mark = {'mid': mid, 'bid': mblog['bid'], 'date': parse_date(mblog['created_at']), 'text': mblog['text']}
+date = parse_date(mblog['created_at'])
+mark = {'mid': mid, 'bid': mblog['bid'], 'date': date, 'text': mblog['text']}
 amount += 1
-if mid < limit[0]: exceed = True
-if mid < limit[0] or mid > limit[1]: continue
+if compare(limit[0], '>', [mid, date]): exceed = True
+if compare(limit[0], '>', [mid, date]) or compare(limit[1], '<', [mid, date]): continue
 if 'pics' in mblog:
     for index, pic in enumerate(mblog['pics'], 1):
         if 'large' in pic:
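
Note on the two hunks above: the new `compare` helper lets a single bound be either a mid (number) or a date, by trying each candidate in turn and skipping the ones whose type cannot be ordered against the bound. The sketch below reproduces that helper and exercises the range check in isolation. It assumes Python 3, where ordering an `int` or `float` against a `datetime.date` raises `TypeError`; `in_range` and the sample posts are hypothetical and not part of the commit.

```python
import datetime

def compare(standard, operation, candidate):
    # Same logic as the helper in the diff: relate `standard` to the first
    # candidate it can be ordered against, then test whether that relation
    # ('>', '=' or '<') appears in `operation`.
    for target in candidate:
        try:
            if standard > target:
                result = '>'
            elif standard == target:
                result = '='
            else:
                result = '<'
            return result in operation
        except TypeError:
            # Incomparable types (e.g. int vs. date): try the next candidate.
            continue
    return None  # nothing was comparable, as in the original (implicit None)

def in_range(limit, mid, date):
    # Hypothetical wrapper around the two checks used in get_resources.
    below = compare(limit[0], '>', [mid, date])
    above = compare(limit[1], '<', [mid, date])
    return not (below or above)

day = datetime.date
posts = [  # made-up (mid, date) pairs, newest first
    (300, day(2019, 12, 10)),
    (200, day(2019, 12, 5)),
    (100, day(2019, 11, 30)),
]

# Both bounds are dates: keep posts from 2019-12-01 through 2019-12-07.
date_limit = (day(2019, 12, 1), day(2019, 12, 7))
by_date = [mid for mid, date in posts if in_range(date_limit, mid, date)]

# Lower bound is a mid, upper bound is open (infinity).
mid_limit = (200, float('inf'))
by_mid = [mid for mid, date in posts if in_range(mid_limit, mid, date)]

print(by_date)  # [200]
print(by_mid)   # [300, 200]
```

Because `compare` returns on the first candidate that can be ordered, a numeric bound is always checked against `mid` and a date bound against `date`, so the two bounds may be of different kinds.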
@@ -312,10 +324,12 @@ def download(url, path, overwrite):
 boundary = args.boundary.split(':')
 boundary = boundary * 2 if len(boundary) == 1 else boundary
 numberify = lambda x: int(x) if re.search(r'^\d+$', x) else bid_to_mid(x)
+dateify = lambda t: datetime.datetime.strptime(t, '@%Y%m%d').date()
+parse_point = lambda p: dateify(p) if p.startswith('@') else numberify(p)
 try:
-    boundary[0] = 0 if boundary[0] == '' else numberify(boundary[0])
-    boundary[1] = float('inf') if boundary[1] == '' else numberify(boundary[1])
-    assert boundary[0] <= boundary[1]
+    boundary[0] = 0 if boundary[0] == '' else parse_point(boundary[0])
+    boundary[1] = float('inf') if boundary[1] == '' else parse_point(boundary[1])
+    if type(boundary[0]) == type(boundary[1]): assert boundary[0] <= boundary[1]
 except:
     quit('invalid id range {}'.format(args.boundary))
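
Note on the hunk above: a boundary endpoint that starts with `@` is now read as a date in `@YYYYMMDD` form, an empty endpoint stays open-ended, and the ordering check only runs when both endpoints have the same type. A minimal sketch of that parsing follows, written as a standalone function for illustration. `parse_boundary` is a hypothetical name (the script does this inline), and bid endpoints are rejected here because `bid_to_mid` lies outside this diff.

```python
import datetime
import re

def parse_point(point):
    # '@YYYYMMDD' is treated as a date, a run of digits as a mid.
    if point.startswith('@'):
        return datetime.datetime.strptime(point, '@%Y%m%d').date()
    if re.search(r'^\d+$', point):
        return int(point)
    # The script converts bids with bid_to_mid(); that helper is not shown
    # in this diff, so this sketch simply refuses them.
    raise ValueError('bid endpoints are not handled in this sketch: {}'.format(point))

def parse_boundary(text):
    # Hypothetical wrapper name; mirrors the inline logic of the hunk.
    parts = text.split(':')
    if len(parts) == 1:
        parts = parts * 2            # 'id' means the range id:id
    lower, upper = parts             # raises ValueError on more than one ':'
    lower = 0 if lower == '' else parse_point(lower)
    upper = float('inf') if upper == '' else parse_point(upper)
    # Ordering is only checked when both endpoints are of the same type.
    if type(lower) == type(upper) and lower > upper:
        raise ValueError('invalid range {}'.format(text))
    return lower, upper

print(parse_boundary('@20191201:@20191215'))  # two dates
print(parse_boundary(':@20191215'))           # open start, date end
print(parse_boundary('4449000000000000'))     # a single mid
print(parse_boundary(':'))                    # everything
```

With these assumptions, `-b @20191201:@20191215` would select posts dated within those two weeks, and `-b :@20191215` everything up to that date.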
