38e7c284e8
- Updated `.gitignore` to streamline ignored files and added logging for common sites.
- Expanded `config.py` with new configurations for Weixin and Redis, and improved database connection settings.
- Refined `README.md` to clarify project structure and usage instructions.
- Enhanced `requirements.txt` with additional dependencies for MongoDB and Redis support.
- Refactored multiple spider scripts to utilize a session-based approach for HTTP requests, improving error handling and proxy management.
- Updated `export_lawyers_excel.py` to include a default timestamp for data exports.
lawyers-common-sites
A standalone project containing common_sites, extracted from /www/wwwroot/lawyer.
Directory layout
- common_sites/: site scraping scripts
- request/: proxy configuration
- utils/: shared utilities
- Db.py: database wrapper
- config.py: project configuration
Quick start
cd /www/wwwroot/lawyers
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
bash common_sites/start.sh
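Notes
- This project directly reuses the original project's database configuration and proxy configuration.
- Scraping depends on the lawyer, area_new, area, and area2 tables in the original database.
- Logs are written to common_sites/*.log by default.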
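
Because scraping depends on four tables from the original database, a pre-flight check before running the start script can catch a misconfigured connection early. The snippet below is a minimal sketch and not part of the project: `REQUIRED_TABLES` and `missing_tables` are hypothetical names, and how the list of existing table names is fetched (for example through the project's Db.py wrapper) is an assumption left to the caller.

```python
from typing import Iterable, List, Sequence

# Table names taken from the notes above; the constant name itself is hypothetical.
REQUIRED_TABLES: Sequence[str] = ("lawyer", "area_new", "area", "area2")


def missing_tables(
    existing: Iterable[str],
    required: Sequence[str] = REQUIRED_TABLES,
) -> List[str]:
    """Return the required table names that are absent from `existing`.

    `existing` is whatever list of table names the caller obtained from
    the database; this helper does not talk to the database itself.
    The result preserves the order of `required`.
    """
    present = set(existing)
    return [name for name in required if name not in present]


if __name__ == "__main__":
    # Example with a hard-coded list standing in for a real database query.
    found = ["lawyer", "area", "unrelated_table"]
    gaps = missing_tables(found)
    if gaps:
        print("Missing tables:", ", ".join(gaps))
    else:
        print("All required tables are present.")
```

With the hard-coded list above, the check reports area_new and area2 as missing. Keeping the helper free of database calls makes it easy to reuse with any driver and to test without a live connection.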