initial commit
Péter Volf committed Sep 29, 2017
0 parents commit 3cf9ab4
Showing 9 changed files with 1,914 additions and 0 deletions.
21 changes: 21 additions & 0 deletions LICENSE.txt
@@ -0,0 +1,21 @@
MIT License

Copyright (c) [year] [fullname]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
34 changes: 34 additions & 0 deletions README.rst
@@ -0,0 +1,34 @@
GraphScraper
=================

GraphScraper is a Python 3 library that contains a base graph implementation designed
to be turned into a web scraper for graph data. It has two major features:

1) The graph automatically manages a database (using either SQLAlchemy or
   Flask-SQLAlchemy) where it stores all the nodes and edges the graph has seen.
2) The base graph implementation provides hook methods that, if implemented,
   turn the graph into a web scraper.
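
The two features above can be illustrated with a minimal sketch. It is not the
library's actual API: the class names, the ``load_neighbors`` hook, and the table
layout are all hypothetical, and an in-memory SQLite database from the standard
library stands in for the SQLAlchemy-managed database.

```python
import sqlite3
from typing import Dict, List


class CachingGraph:
    """Hypothetical base graph: caches edges in SQLite and defers loading to a hook."""

    def __init__(self) -> None:
        # In-memory SQLite stands in for the SQLAlchemy-managed database (assumption).
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE edge (source TEXT, target TEXT, UNIQUE (source, target))"
        )
        self._db.execute("CREATE TABLE loaded (name TEXT PRIMARY KEY)")

    def load_neighbors(self, name: str) -> List[str]:
        """Hook method: subclasses return the neighbors of `name` from the remote source."""
        raise NotImplementedError

    def neighbors(self, name: str) -> List[str]:
        """Return the neighbors of `name`, calling the hook only on a cache miss."""
        seen = self._db.execute(
            "SELECT 1 FROM loaded WHERE name = ?", (name,)
        ).fetchone()
        if seen is None:
            # Cache miss: ask the hook, then store everything it returned.
            for target in self.load_neighbors(name):
                self._db.execute(
                    "INSERT OR IGNORE INTO edge VALUES (?, ?)", (name, target)
                )
            self._db.execute("INSERT INTO loaded VALUES (?)", (name,))
        rows = self._db.execute(
            "SELECT target FROM edge WHERE source = ? ORDER BY target", (name,)
        )
        return [row[0] for row in rows]


class DictBackedGraph(CachingGraph):
    """Toy "scraper" whose remote source is a plain dictionary."""

    def __init__(self, remote: Dict[str, List[str]]) -> None:
        super().__init__()
        self._remote = remote
        self.remote_calls = 0  # Counts how often the hook had to hit the "remote".

    def load_neighbors(self, name: str) -> List[str]:
        self.remote_calls += 1
        return self._remote.get(name, [])
```

In this sketch a second call to ``neighbors`` for the same node is answered from
the database, so the hook (and therefore the remote source) is consulted only once
per node.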

Demo - igraph
------------------

Besides the base graph implementation, the project includes a working demo built on
the igraph_ library that shows how to implement and use a graph scraper.
Instead of scraping the web, the demo uses an igraph graph instance as the "remote"
source to scrape data from.

Dependencies
-----------------

The project requires SQLAlchemy_ or Flask-SQLAlchemy_ to be installed.
If you wish to use the included igraph-based graph implementation, you will also need
the igraph_ library.
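
As a rough way to check which of these dependencies are present, the sketch below
probes for importable modules using the standard library. The helper name
``is_available`` is hypothetical, and the module names ``sqlalchemy``,
``flask_sqlalchemy`` and ``igraph`` are the usual import names of the three packages.

```python
from importlib.util import find_spec


def is_available(module_name: str) -> bool:
    """Return True if a top-level module can be imported in this environment."""
    return find_spec(module_name) is not None


if __name__ == "__main__":
    # One of the two database backends is required; igraph is only needed for the demo.
    for name in ("sqlalchemy", "flask_sqlalchemy", "igraph"):
        status = "installed" if is_available(name) else "missing"
        print(f"{name}: {status}")
```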

Contribution
-----------------

Any form of constructive contribution (feedback, features, bug fixes, tests, additional
documentation, etc.) is welcome.

.. _Flask-SQLAlchemy: http://flask-sqlalchemy.pocoo.org/
.. _igraph: http://igraph.org
.. _SQLAlchemy: https://www.sqlalchemy.org/
37 changes: 37 additions & 0 deletions setup.py
@@ -0,0 +1,37 @@
from codecs import open
from os import path
from setuptools import setup, find_packages

# Get the long description from the README file
with open(path.join(path.abspath(path.dirname(__file__)), 'README.rst'), encoding='utf-8') as f:
    long_description = f.read()

setup(
    name="graphscraper",
    version="0.1.0",
    description="Graph implementation that loads graph data (nodes and edges) from external sources "
                "and caches the loaded data in a database using sqlalchemy or flask-sqlalchemy.",
    long_description=long_description,
    url="https://github.com/volfpeter/graphscraper",
    author="Peter Volf",
    author_email="[email protected]",
    license="MIT",
    classifiers=[
        "Development Status :: 4 - Beta",
        "Intended Audience :: Developers",
        "Intended Audience :: Information Technology",
        "Intended Audience :: Science/Research",
        "License :: OSI Approved :: MIT License",
        "Natural Language :: English",
        "Programming Language :: Python :: 3",
        "Topic :: Database",
        "Topic :: Internet",
        "Topic :: Scientific/Engineering",
        "Topic :: Software Development :: Libraries",
        "Topic :: Utilities"
    ],
    keywords="graph network webscraping sqlalchemy database db caching",
    package_dir={"": "src"},
    packages=find_packages("src"),
    python_requires=">=3"
)
23 changes: 23 additions & 0 deletions src/graphscraper/__init__.py
@@ -0,0 +1,23 @@
"""
The root package of the graphscraper project.
"""

# Imports
# ------------------------------------------------------------

# Expose the core modules for ease of use.
from graphscraper import base
from graphscraper import db

# No need to expose the eventdispatcher module. It won't be needed by the user.
# from graphscraper import eventdispatcher

# Do not import the rest of the modules, because they have dependencies that
# might not be available on the user's machine.
# from graphscraper import demo
# from graphscraper import igraphwrapper

# Module constants
# ------------------------------------------------------------

__author__ = 'Peter Volf'