Miscellaneous projects and scripts.

Author and contact: [email protected]

Performance Engineering and Apache Spark

| Folder | Description |
|--------|-------------|
| Spark Dashboard | A tool for Apache Spark monitoring; use it to build a performance dashboard and troubleshoot Spark jobs. |
| Spark Notes | Miscellaneous tips and code snippets about Apache Spark. |
| Spark for Physics | Examples, with code and data, of using Apache Spark for High Energy Physics data analysis. |
| Performance Testing | Code and examples, including:<br>- A tool to run TPC-DS at scale with PySpark and collect execution metrics<br>- Tools for load-testing CPUs in Python and Rust<br>- Notes on how to use various tools for performance investigations |

Data Engineering and Data Science

| Folder | Description |
|--------|-------------|
| Deep Learning Notes | Notes and examples on Deep Learning tools and related data pipelines. |
| Pyspark_SQL_Magic_Jupyter | How to write Jupyter SQL magic functions for PySpark and Spark SQL. |
| Trino and Presto on Jupyter | Example of using Trino or Presto on a Jupyter notebook. |
| PostgreSQL and YugabyteDB on Jupyter | Example of using PostgreSQL or YugabyteDB on a Jupyter notebook. |
| Oracle_Jupyter | Examples of how to query Oracle using Jupyter/IPython notebooks. |
| Impala_SQL_Jupyter | Examples of how to run SQL on Apache Impala using Jupyter/IPython notebooks. |
| SQL_color_Mandelbrot | How to use SQL to compute and display the Mandelbrot set with colors. Examples for Oracle and PostgreSQL. |
| PLSQL_Neural_Network | An example of neural network inference using Oracle RDBMS and PL/SQL. |