Thinking Dataset v0.0.1: A Foundation for Strategic AI-Driven Business Intelligence with STaR Case Study Generation
A Framework for Strategic Business Insights
Release v0.0.1 - Initial Release
Overview
The Thinking Dataset Project v0.0.1 introduces the foundational framework for generating strategic business insights and STaR (Situation, Task, Action, Result) case studies. This initial release sets up the basic infrastructure for analyzing complex strategic scenarios, ethical dilemmas, and decision-making processes.
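As a rough sketch of what a STaR case study record could look like, the dataclass below mirrors the Situation/Task/Action/Result breakdown described above; the class name, fields, and sample values are illustrative, not taken from the released code:

```python
from dataclasses import dataclass, asdict

@dataclass
class StarCaseStudy:
    """Hypothetical record following the STaR structure."""
    situation: str  # strategic context or dilemma
    task: str       # objective the decision-maker faces
    action: str     # course of action taken
    result: str     # observed outcome

# Example record (invented content for illustration)
case = StarCaseStudy(
    situation="A regional retailer faces a sudden supply-chain disruption.",
    task="Decide whether to switch suppliers or absorb the delays.",
    action="Negotiated a short-term dual-sourcing agreement.",
    result="Delivery times stabilized within one quarter.",
)
print(list(asdict(case).keys()))
```

A flat record like this serializes cleanly to JSON or a database row, which suits the pipeline-oriented design described below.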
Key Features
- Basic Pipeline Infrastructure: Initial implementation of data ingestion and preprocessing pipelines
- SQLite Database Setup: Basic data storage and management system
- CLI Tool: Essential command-line interface for basic dataset operations
- Initial Adapters: Basic support for Hugging Face and Ollama endpoints
- Core Documentation: Basic documentation covering installation and usage
Components
Core Features
- Data ingestion functionality
- Preprocessing pipeline
- Foundational case study format
- Model evaluation framework
- Adapter implementations
Technical Implementation
- Python 3.10+ support
- SQLite and SQLAlchemy integration
- Pipeline configuration
- Automatic environment setup
- Robust logging
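As a rough illustration of the SQLite + SQLAlchemy pairing mentioned above, the snippet below creates an in-memory database and round-trips a row; the table name and schema are invented for the example and are not the project's actual schema:

```python
from sqlalchemy import create_engine, text

# In-memory SQLite database managed through SQLAlchemy Core
engine = create_engine("sqlite:///:memory:")

with engine.begin() as conn:  # transactional block, commits on exit
    conn.execute(text(
        "CREATE TABLE case_studies (id INTEGER PRIMARY KEY, title TEXT)"
    ))
    conn.execute(
        text("INSERT INTO case_studies (title) VALUES (:t)"),
        {"t": "Demo case study"},
    )

with engine.connect() as conn:
    rows = conn.execute(text("SELECT title FROM case_studies")).fetchall()
print(rows[0][0])
```

Swapping the `sqlite:///` URL for a file path (e.g. `sqlite:///data.db`) persists the database to disk without any other code changes.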
Installation
Prerequisites
- Python 3.10 or later
- Git
- A cloud provider account (e.g., OpenAI), a local GPU (RTX 3090 or better), or both, for processing
Setup
1. Clone the repository:

   ```shell
   git clone https://github.com/MultiTonic/thinking-dataset.git
   cd thinking-dataset
   ```

2. Install the `uv` package manager. First add the package into the global environment:

   ```shell
   pip install uv
   ```

   Then add the uv tools directory to your PATH*:

   ```shell
   uv tool update-shell
   ```

3. Set up the project:

   ```shell
   uv run setup
   ```

   *You may need to restart your terminal session for the changes to take effect.

   This will create a virtual environment, install the project dependencies, and activate the virtual environment.

4. Set up environment variables:

   Copy the `.env.sample` file to `.env` and change the values as needed:

   ```shell
   cp .env.sample .env
   ```

   Update the `.env` file with your credentials:

   ```shell
   # Required settings
   HF_ORG="my_huggingface_organization"
   HF_USER="my_huggingface_username"
   HF_READ_TOKEN="my_huggingface_read_access_token"
   HF_WRITE_TOKEN="my_huggingface_write_access_token"

   # Required configuration
   CONFIG_PATH="config/config.yaml"

   # One or more providers
   OLLAMA_SERVER_URL="http://localhost:11434"
   OPENAI_API_TOKEN="your_openai_api_token"
   RUNPOD_API_TOKEN="your_runpod_api_token"
   ```
Breaking Changes
None (Initial Release)
Bug Fixes
- Initial release, no bug fixes to report
Dependencies
- Python 3.10+
- SQLite
- pandas
- scikit-learn
- rich
- python-dotenv
- Hugging Face Transformers
- Ollama
- Runpod
- Additional dependencies listed in `thinking-dataset.toml`
Security Updates
- Initial security configurations implemented
- Basic authentication and authorization flows established
Documentation Updates
- Added initial project documentation
- Included installation guide
- Basic usage instructions
- Architecture overview
Known Issues
- Limited to text-based data processing in this release
- GPU support requires RTX 3090 or greater
- Some advanced features planned for future releases
- Can only think; reasoning coming in v0.0.2!
Upgrade Instructions
Initial release - no upgrade needed.
Contributors
Special thanks to our initial contributors:
- Kara Rawson (Lead Engineer)
- Joseph Pollack (Creator & Business Leader)
- MultiTonic Team
Support
For support and questions:
- Create an issue on GitHub
- Join our Discord
- Email: [email protected]
Version
- Release: v0.0.1
- Date: 2024-01-25
- Commit: 0716d8d (Initial commit)
What's Changed
- Kev With Code - 🏆 by @Josephrp in #3
- Added Dynamic Variables into our configuration file. by @p3nGu1nZz in #45
- Renamed Prepare to Process by @Daksh2000 in #64
New Contributors
- @Josephrp made their first contribution in #3
- @p3nGu1nZz made their first contribution in #45
Full Changelog: https://github.com/MultiTonic/thinking-dataset/commits/v0.0.1