Commit
* chore(cargo): update redis-macros requirement from 0.2.1 to 0.3.0

  Updates the requirements on [redis-macros](https://github.com/daniel7grant/redis-macros) to permit the latest version.
  - [Release notes](https://github.com/daniel7grant/redis-macros/releases)
  - [Changelog](https://github.com/daniel7grant/redis-macros/blob/master/Changelog.md)
  - [Commits](daniel7grant/redis-macros@v0.2.1...v0.3.0)

  ---
  updated-dependencies:
  - dependency-name: redis-macros
    dependency-type: direct:production
  ...

  Signed-off-by: dependabot[bot] <[email protected]>

* ci(actions): bump actions/checkout from 2 to 4

  Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 4.
  - [Release notes](https://github.com/actions/checkout/releases)
  - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
  - [Commits](actions/checkout@v2...v4)

  ---
  updated-dependencies:
  - dependency-name: actions/checkout
    dependency-type: direct:production
    update-type: version-update:semver-major
  ...

  Signed-off-by: dependabot[bot] <[email protected]>

* fix `src/model/item.rs` (#92)
* Moved vinted-db-feeder as submodule
* Fix/advanced item test (#98)
* Updated item JSON
* Upgraded testing quality
* Storing test results in /output
* Added more new JSON fields
* Using lazy_static for logging
* Updated namespace for debug
* Removed lazy initialization
* Fixed clippy
* Added new SerdeError
* fixed clippy
* Bump v0.9.0
* chore(cargo): update redis requirement from 0.24.0 to 0.25.4 (#93)

  Updates the requirements on [redis](https://github.com/redis-rs/redis-rs) to permit the latest version.
  - [Release notes](https://github.com/redis-rs/redis-rs/releases)
  - [Commits](redis-rs/redis-rs@redis-0.24.0...redis-0.25.4)

  ---
  updated-dependencies:
  - dependency-name: redis
    dependency-type: direct:production
  ...

  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
  Co-authored-by: Álvaro Cabo Ciudad <[email protected]>

* chore(cargo): update typed-builder requirement from 0.18 to 0.19 (#96)

  ---
  updated-dependencies:
  - dependency-name: typed-builder
    dependency-type: direct:production
  ...

  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
  Co-authored-by: Álvaro Cabo Ciudad <[email protected]>

* Updated CHANGELOG
* chore(cargo): update reqwest_cookie_store requirement from 0.6 to 0.8 (#94)

  * chore(cargo): update reqwest_cookie_store requirement from 0.6 to 0.8

    Updates the requirements on [reqwest_cookie_store](https://github.com/pfernie/reqwest_cookie_store) to permit the latest version.
    - [Changelog](https://github.com/pfernie/reqwest_cookie_store/blob/main/CHANGELOG.md)
    - [Commits](pfernie/reqwest_cookie_store@v0.6.0...v0.8.0)

    ---
    updated-dependencies:
    - dependency-name: reqwest_cookie_store
      dependency-type: direct:production
    ...

    Signed-off-by: dependabot[bot] <[email protected]>

  * Fix reqwest version

  ---------

  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
  Co-authored-by: Álvaro Cabo Ciudad <[email protected]>

* fix method naming
* Removed unnecessary @staticmethod tag

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Baptiste Zahnow <[email protected]>
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,3 +3,5 @@ Cargo.lock | |
.vscode/ | ||
/**/results/ | ||
docker/query.sh | ||
|
||
src/tests/output |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
data/* | ||
!data/raw | ||
|
||
pyvenv | ||
*/__pycache__ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
# Vinted data scraper | |
|
||
A set of Python data scrapers that extract static information from the Vinted source code, intended as a complement to a Vinted API wrapper. | |
|
||
## Install & Run | ||
|
||
### Requirements | ||
|
||
- Pip | ||
- Python | ||
|
||
1. Create a virtual environment | ||
|
||
```bash | ||
python3 -m venv pyvenv/ | ||
source pyvenv/bin/activate | |
``` | ||
|
||
2. Install the dependencies listed in `requirements.txt` | |
|
||
```bash | ||
pip3 install -r requirements.txt | ||
``` | ||
|
||
3. Run `main.py` | ||
|
||
## Extracted data categories | ||
|
||
| Element | Fields Returned | | ||
| ------------------- | --------------------------------------------------- | | ||
| Brands | Names, Ids | | ||
| Materials | Id, Name | | ||
| Colors | Id, Color, Hex Code | | ||
| Sizes | Id, Title, Size_Type, Category_id | | ||
| Categories | Id, Title, Code, Parent Id, URL, URL EN, Item Count | | ||
| Categories Children | Category Id, Child Id | | ||
| Countries | Id, French_name, local_name, ISO_code, flag_emoji | | ||
|
||
### Materials and sizes | ||
|
||
- Available languages: 🇪🇸 🇫🇷 🇺🇸 | ||
- More languages can be added by including the corresponding HTML file in `data/raw/materials` or `data/raw/sizes` | |
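
As a rough illustration of how a per-language raw file could be picked up, the sketch below lists the HTML files in a raw-data directory and keys them by file stem. The file naming (`es.html`, `fr.html`, ...) and the helper `discover_languages` are assumptions made for illustration; the repository's actual layout and loader may differ.

```python
import tempfile
from pathlib import Path


def discover_languages(raw_dir: Path) -> dict[str, Path]:
    """Map a language code (assumed to be the file stem) to its HTML file."""
    return {p.stem: p for p in sorted(raw_dir.glob("*.html"))}


# Usage with a throwaway directory standing in for data/raw/materials.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp)
    for code in ("es", "fr", "us"):
        (raw / f"{code}.html").write_text("<html></html>", encoding="utf-8")
    # A non-HTML file is not picked up by the glob.
    (raw / "notes.txt").write_text("ignored", encoding="utf-8")
    languages = sorted(discover_languages(raw))

print(languages)
```

Under this assumed convention, adding a language amounts to dropping one more `.html` file into the directory.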
|
||
### Categories - Catalogs | ||
|
||
- **Debug mode:** Builds the full decision tree | ||
|
||
- **Exec mode:** Returns 2 CSVs: | ||
|
||
- `categories.csv`: Table of all the available categories and their attributes | ||
|
||
- `categories_children.csv`: Dictionary that models the Category->Children list relationship | ||
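
The Category->Children relationship can be loaded back into a dictionary roughly as follows. This is only a sketch: the column names `category_id` and `child_id` are guessed from the fields listed in the table above, and the sample rows are made up, so both would need adjusting to the real file.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for categories_children.csv;
# the header names are assumptions, not taken from the real file.
SAMPLE = """category_id,child_id
1,10
1,11
10,100
"""


def load_children(handle) -> dict[int, list[int]]:
    """Group child ids under their parent category id."""
    children = defaultdict(list)
    for row in csv.DictReader(handle):
        children[int(row["category_id"])].append(int(row["child_id"]))
    return dict(children)


tree = load_children(io.StringIO(SAMPLE))
print(tree)
```

To read the real output instead of the sample, pass an open file handle (for example `open("categories_children.csv", newline="")`) in place of the `io.StringIO` object.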
|
||
## Performance | ||
|
||
- Without brands search: | ||
|
||
```bash | ||
real 0m1,941s | ||
user 0m1,225s | ||
sys 0m0,037s | ||
``` | ||
|
||
- Brands validation process: | ||
|
||
```bash | ||
real 14m14,211s | ||
user 0m19,753s | ||
sys 0m1,229s | ||
``` | ||
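
One way to read the two timings is to compare CPU time (`user` + `sys`) with wall-clock time (`real`). The sketch below parses the `time` values shown above, including their comma decimal separator, and computes that ratio. A low ratio suggests the process spends most of its time waiting (for example on I/O), although that reading is an inference and not something measured here. The helper names are hypothetical.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time` value such as '14m14,211s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[,.](\d+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    minutes, whole, frac = match.groups()
    return int(minutes) * 60 + float(f"{whole}.{frac}")


def cpu_share(real: str, user: str, sys: str) -> float:
    """Fraction of wall-clock time spent on the CPU."""
    return (to_seconds(user) + to_seconds(sys)) / to_seconds(real)


without_brands = cpu_share("0m1,941s", "0m1,225s", "0m0,037s")
with_brands = cpu_share("14m14,211s", "0m19,753s", "0m1,229s")
print(f"without brands: {without_brands:.1%}")
print(f"brands validation: {with_brands:.1%}")
```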
|
||
## Authors | ||
|
||
[Álvaro Cabo](https://github.com/alvarocabo) | ||
|
||
[Pepe Márquez](https://github.com/pxp9) |
Large diffs are not rendered by default.