# Python Environment
What's pre-installed in Margin kernels and how to add packages.
Margin kernels run Python 3.11 with a curated set of data science libraries pre-installed. You can also install additional packages within your session.
## Pre-installed Libraries
Every kernel starts with these libraries ready to import:
### Data Manipulation
| Library | Version | Import |
|---|---|---|
| pandas | 2.2+ | import pandas as pd |
| numpy | 1.26+ | import numpy as np |
| polars | 0.20+ | import polars as pl |
### Visualization
| Library | Version | Import |
|---|---|---|
| matplotlib | 3.8+ | import matplotlib.pyplot as plt |
| seaborn | 0.13+ | import seaborn as sns |
| plotly | 5.18+ | import plotly.express as px |
| altair | 5.2+ | import altair as alt |
### Machine Learning
| Library | Version | Import |
|---|---|---|
| scikit-learn | 1.4+ | from sklearn import ... |
| statsmodels | 0.14+ | import statsmodels.api as sm |
| scipy | 1.12+ | import scipy |
### Utilities
| Library | Version | Import |
|---|---|---|
| requests | 2.31+ | import requests |
| beautifulsoup4 | 4.12+ | from bs4 import BeautifulSoup |
| openpyxl | 3.1+ | import openpyxl |
| pyarrow | 15+ | import pyarrow |
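If you want to confirm exactly what your kernel has, you can query package metadata at runtime. This is a quick sanity check, not a Margin-specific API — it uses only the standard library:

```python
import sys
from importlib import metadata

# The docs state kernels run Python 3.11; confirm the interpreter version
print(sys.version_info[:2])

# Check a few of the pre-installed libraries (names from the tables above)
for pkg in ["pandas", "numpy", "scipy"]:
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, "not installed")
```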
### Margin Tools
| Library | Description |
|---|---|
| margin | Load datasets from your workspace |
## Installing Additional Packages
Need something else? Install packages within your session:
```python
# Install a package
!pip install package-name

# Install a specific version
!pip install package-name==1.2.3

# Install multiple packages
!pip install package1 package2 package3
```
Installed packages persist for the duration of your kernel session. When you disconnect and reconnect, you'll need to reinstall them.
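Because installs don't survive a reconnect, a handy pattern is an idempotent install helper in your first cell, so re-running it only installs what's missing. This is a sketch, not a Margin feature; `ensure_installed` is a hypothetical helper name:

```python
import importlib.util
import subprocess
import sys

def ensure_installed(package, import_name=None):
    """Run pip install only if the package isn't importable yet."""
    module = import_name or package
    if importlib.util.find_spec(module) is None:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])

# Already importable, so this is a no-op:
ensure_installed("json")

# The pip name and the import name can differ:
# ensure_installed("beautifulsoup4", import_name="bs4")
```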
### Common Additions
```python
# NLP
!pip install nltk spacy transformers

# Time series
!pip install prophet pmdarima

# Geographic
!pip install geopandas folium

# Financial
!pip install yfinance pandas-ta
```
## Working with Large Datasets
For large files, use efficient formats and lazy loading:
```python
# Parquet is faster than CSV for large files
df = pd.read_parquet('large_file.parquet')

# Or use Polars for even better performance
import polars as pl
df = pl.read_parquet('large_file.parquet')

# Chunked reading for huge CSVs
chunks = pd.read_csv('huge.csv', chunksize=10000)
for chunk in chunks:
    process(chunk)  # your per-chunk logic
```
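The chunked pattern above keeps only one slice in memory at a time. Here is a self-contained illustration that aggregates across chunks, using an in-memory CSV in place of a huge file on disk:

```python
import io
import pandas as pd

# A small in-memory "file" standing in for a huge CSV on disk
csv_data = io.StringIO("value\n" + "\n".join(str(i) for i in range(10)))

# Aggregate incrementally so only one chunk is loaded at a time
total = 0
for chunk in pd.read_csv(csv_data, chunksize=4):
    total += chunk["value"].sum()

print(total)  # 0 + 1 + ... + 9 = 45
```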
## Memory Management
Kernels have memory limits. Keep your session healthy:
```python
# Check memory usage of a DataFrame
print(f"DataFrame size: {df.memory_usage(deep=True).sum() / 1e6:.1f} MB")

# Free memory by deleting large objects
del large_dataframe
import gc
gc.collect()

# Use efficient dtypes
df['category_col'] = df['category_col'].astype('category')
df['small_int'] = df['small_int'].astype('int8')
```
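The `category` dtype pays off whenever a column repeats a small set of values, because each string is stored once and rows hold compact integer codes. A quick way to see the effect (column values here are just illustrative):

```python
import pandas as pd

# A column with few distinct values repeated many times
df = pd.DataFrame({"city": ["Oslo", "Bergen", "Oslo", "Tromso"] * 25_000})

before = df["city"].memory_usage(deep=True)
df["city"] = df["city"].astype("category")
after = df["city"].memory_usage(deep=True)

print(f"{before / 1e6:.1f} MB -> {after / 1e6:.1f} MB")
```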
## Display Settings
Configure how outputs render:
```python
# Pandas display options
pd.set_option('display.max_columns', 50)
pd.set_option('display.max_rows', 100)
pd.set_option('display.float_format', '{:.2f}'.format)

# Matplotlib defaults
plt.rcParams['figure.figsize'] = [10, 6]
plt.rcParams['figure.dpi'] = 100

# Plotly defaults (for inline display)
import plotly.io as pio
pio.renderers.default = 'notebook'
```
## Kernel Lifecycle
Understanding the kernel lifecycle helps avoid surprises:
| State | What's Happening |
|---|---|
| Connecting | Establishing WebSocket connection to the kernel server |
| Warming up | Python environment initializing, loading packages. Cells you run will queue. |
| Connected | Ready to execute. All pre-installed packages available, plus your installed packages and variables. |
| Restarted | Fresh environment. Pre-installed packages only—reinstall custom packages, re-run setup cells. |
| Disconnected | Nothing running. Reconnect to get a fresh environment. |
If you run cells while connecting or warming up, they queue and execute automatically once the kernel is ready.
## Resource Limits
Kernels enforce built-in limits to keep sessions stable and shared resources fair:
| Limit | Value | Description |
|---|---|---|
| Memory | ~800 MB | Per-kernel RAM limit |
| Execution timeout | 5 minutes | Max time for a single cell to run |
| Idle timeout | 10 minutes | Disconnects after inactivity (auto-reconnects when you run code) |
If your code hits the execution timeout, the kernel terminates and you'll need to reconnect. For long-running computations, consider breaking them into smaller chunks or using more efficient algorithms.
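One way to break a long computation into timeout-safe pieces is to checkpoint progress to disk, so each cell run does a bounded amount of work and nothing is lost if a run is cut short. This is a sketch, not a Margin feature; `progress.json` and `run_batch` are hypothetical names:

```python
import json
import pathlib

# Hypothetical checkpoint file in the kernel's working directory
CHECKPOINT = pathlib.Path("progress.json")

def run_batch(items, batch_size=1000):
    """Do one bounded slice of work per call; re-run the cell until done."""
    if CHECKPOINT.exists():
        state = json.loads(CHECKPOINT.read_text())
    else:
        state = {"done": 0, "total": 0}
    batch = items[state["done"]:state["done"] + batch_size]
    state["total"] += sum(x * x for x in batch)  # stand-in for the real work
    state["done"] += len(batch)
    CHECKPOINT.write_text(json.dumps(state))
    return state

state = run_batch(list(range(2500)))
print(f"{state['done']} of 2500 items processed")
```

Each run finishes well within the per-cell timeout, and re-running the same cell resumes where the last run left off.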
> **Tip:** Put your imports and `!pip install` commands in the first cell, then re-run it after connecting or restarting.
## Tips for Smooth Sessions
- Group imports at the top – Easy to re-run after a restart
- Use requirements cells – Put `!pip install` commands in their own cell
- Clear outputs before sharing – Reduces notebook file size
- Restart when things get weird – Memory leaks happen; a fresh start helps
## Next Steps

- Load your datasets with the `margin` library
- Master keyboard shortcuts for faster work
- Create briefs from your outputs