HTML Engine (Rust-based Experimental Browser)

This project is an experimental HTML engine built with Rust, combining web scraping, file system manipulation, and a graphical user interface (GUI) using GTK. It demonstrates how to fetch and parse HTML content from a webpage, extract specific elements, and display them in a GTK-based window.

Features

Web Scraping with Reqwest: Fetches HTML content from a URL using the reqwest library.
HTML Parsing: Extracts specific HTML tags (<h1>, <h2>, <h3>, <h4>, <h5>, and <p>) and stores their content.
File Handling with OpenOptions: Saves the fetched HTML to a file and allows further reading and writing operations.
GTK Integration for GUI: Displays the parsed content in a user-friendly GTK window, with labels for each HTML tag.
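
The file-handling feature can be illustrated with a minimal sketch that uses only the standard library. The function names save_html and load_html and the use of the system temp directory are assumptions made for this example; they are not taken from the repository, whose own fs function may be organised differently.

```rust
use std::fs::OpenOptions;
use std::io::{self, Read, Write};
use std::path::Path;

/// Writes `html` to `path`, replacing any previous contents.
/// (Hypothetical helper, not the repository's `fs` function.)
fn save_html(path: &Path, html: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true) // create the file if it does not exist yet
        .truncate(true) // discard content left over from an earlier run
        .open(path)?;
    file.write_all(html.as_bytes())
}

/// Reads the whole file back into a `String`.
/// (Hypothetical helper, not the repository's code.)
fn load_html(path: &Path) -> io::Result<String> {
    let mut contents = String::new();
    OpenOptions::new()
        .read(true)
        .open(path)?
        .read_to_string(&mut contents)?;
    Ok(contents)
}

fn main() -> io::Result<()> {
    // The temp directory is used here only to keep the example side-effect free.
    let path = std::env::temp_dir().join("helo.html");
    save_html(&path, "<h1>Hello</h1>")?;
    let html = load_html(&path)?;
    println!("{html}");
    Ok(())
}
```

Setting truncate(true) matters when a shorter page is saved over a longer one: without it, the tail of the old file would remain after the new content.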

How it Works

Fetching HTML Content: The HTML content of a given URL is retrieved using reqwest. This is done through the get_html_content function, which handles HTTP requests and returns the HTML as a String.
Saving and Reading HTML: The fs function saves the HTML content to a local file (helo.html). This allows the engine to reuse the data for future parsing and display.
Parsing HTML Tags: The parser function scans the HTML file, extracting the contents of specific tags like h1, h2, and p. These elements are then stored in a Vec<String> and displayed in the GUI.
Displaying Content with GTK: The GTK-based GUI displays the URL, HTML content, and parsed elements in a vertical layout. Each element (e.g., h1 or p) is displayed as a separate label within the window.
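
The parsing step described above can be sketched with plain string searching. This is an illustration under stated assumptions, not the repository's parser: the names extract_tag and parse are hypothetical, tag names are matched case-sensitively, and nested elements of the same tag are not handled.

```rust
/// Collects the inner text of every `<tag>...</tag>` pair found in `html`.
/// (Hypothetical helper written for this example.)
fn extract_tag(html: &str, tag: &str) -> Vec<String> {
    let open = format!("<{tag}");
    let close = format!("</{tag}>");
    let mut found = Vec::new();
    let mut rest = html;

    while let Some(start) = rest.find(&open) {
        let after_name = &rest[start + open.len()..];
        // Reject longer tag names such as `<pre>` when looking for `<p>`.
        let is_exact =
            after_name.starts_with('>') || after_name.starts_with(char::is_whitespace);
        if !is_exact {
            rest = after_name;
            continue;
        }
        // Skip over any attributes to the end of the opening tag.
        let Some(gt) = after_name.find('>') else { break };
        let body = &after_name[gt + 1..];
        // Stop if the element is never closed.
        let Some(end) = body.find(&close) else { break };
        found.push(body[..end].trim().to_string());
        rest = &body[end + close.len()..];
    }
    found
}

/// Extracts the tags listed under Features into one `Vec<String>`.
/// Results are grouped by tag name, not by position in the document.
fn parse(html: &str) -> Vec<String> {
    ["h1", "h2", "h3", "h4", "h5", "p"]
        .iter()
        .flat_map(|tag| extract_tag(html, tag))
        .collect()
}

fn main() {
    let html = "<h1>Title</h1><pre>skip</pre><p class=\"x\">First</p><h2>Sub</h2><p>Second</p>";
    for item in parse(html) {
        println!("{item}");
    }
}
```

String searching of this kind is fragile on real pages (comments, scripts, and malformed markup can all confuse it), which is the limitation the Improved Parsing item under Future Improvements refers to.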

Installation

To build and run this project, you'll need:

Rust toolchain (install via rustup)
GTK 3 development libraries
reqwest crate for web scraping
gtk crate for the GUI

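Install dependencies:

Install the GTK 3 development libraries with your system's package manager. The reqwest and gtk crates need no separate install step: they are library dependencies declared in Cargo.toml, so Cargo downloads and compiles them on the first build. (cargo install only installs binary crates and does not work for libraries such as these.)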

Clone the repository and run the application:

git clone https://github.com/vastsea0/html-engine
cd html-engine
cargo run

Usage

Modify the URL in the main() function to the desired webpage.
Run the application. The HTML content will be fetched, saved to helo.html, and parsed.
A GTK window will open, showing the URL and parsed HTML tags (h1, h2, etc.).

Example Output

When pointed at a URL such as https://html5doctor.com/, the engine extracts the h1 through h5 and p elements from the page and displays their content in a GTK window.

Future Improvements

CSS Parsing and Rendering: Add support for basic CSS styles to enhance rendering capabilities.
Mini JavaScript Handling: Evaluate JavaScript within the HTML for more dynamic content.
Improved Parsing: Expand the parser to handle more complex HTML structures.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

