James Rocker James-Rocker

@James-Rocker
James-Rocker / pandera_vs_pydantic.py
Created December 26, 2024 20:06
Comparing Pandera vs Pydantic
import pandas as pd
import time
from pydantic import BaseModel, ValidationError, Field
import pandera as pa
from pandera import Column, DataFrameSchema
# Generate a synthetic DataFrame
def generate_data(n: int) -> pd.DataFrame:
    data = {
@James-Rocker
James-Rocker / schema_validation_comparison.py
Created September 17, 2024 12:44
pydantic vs pandera performance while working with dataframes
import pandas as pd
import time
from pydantic import BaseModel, ValidationError, Field
import pandera as pa
from pandera import Column, DataFrameSchema
# Generate a synthetic DataFrame
def generate_data(n: int) -> pd.DataFrame:
    data = {
@James-Rocker
James-Rocker / basic.py
Last active July 9, 2024 12:16
Why use mypy - CSC
# Without type hints
def add(a, b):
    return a + b
# Calling the function
result = add(2, 3)
# With type hints
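The preview stops before the typed version appears. As a rough sketch of what a type-hinted counterpart could look like (the name `add_typed` and the `int` annotations are assumptions for illustration, not the gist's actual code):

```python
# Hypothetical typed counterpart to add(); the int annotations are an assumption.
def add_typed(a: int, b: int) -> int:
    return a + b


# Matches the annotations, so a static checker such as mypy accepts it.
typed_result = add_typed(2, 3)

# Python does not enforce annotations at runtime, so this call still runs
# and concatenates the strings. A static checker such as mypy is what
# flags the str arguments as incompatible with int.
concatenated = add_typed("2", "3")
```

The second call is the kind of mistake the untyped `add` lets through silently: it only surfaces when a type checker reads the annotations.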
import pandas as pd
import polars as pl
import time
# Generate a sample CSV file for the test
data = {
"col1": range(1, 1000001),
"col2": range(1000000, 0, -1),
"col3": ["text"] * 1000000
}
@James-Rocker
James-Rocker / gist:174090010f2764b8c31ee30841c2b1e8
Last active April 10, 2022 14:20
Data Analyst Career Paths
SQL Developer. Good money. Can work full time or as your own consultant business. Writes SQL code.
BI Developer. Better money. Can work full time or as your own consultant business. Writes whatever code is needed to produce the required report. Pay improves if you also advise on which metrics are good or bad for a dashboard, but not on what the company should do with the numbers you present.
Report writer. Lower pay, and you would just work as an employee. You are given the data and the required output, and you just make the dashboard.
Maybe an ETL / ELT developer. Good to better money. Can work full time or as your own consultant business. Moving data from one system to another: one-time moves, ongoing scheduled jobs, or real-time transfers. Lots of SQL code, but also using other tools to transform and move data.
Data Engineer. Top pay. Requires good understanding of many database systems, data tools, and programming languages. Not necessarily being fluent in anything, but basic knowledge on h