Margin kernels run Python 3.11 with a curated set of data science libraries pre-installed. You can also install additional packages within your session.
Every kernel starts with these libraries ready to import:
**Data manipulation**

| Library | Version | Import |
|---|---|---|
| pandas | 2.2+ | `import pandas as pd` |
| numpy | 1.26+ | `import numpy as np` |
| polars | 0.20+ | `import polars as pl` |

**Visualization**

| Library | Version | Import |
|---|---|---|
| matplotlib | 3.8+ | `import matplotlib.pyplot as plt` |
| seaborn | 0.13+ | `import seaborn as sns` |
| plotly | 5.18+ | `import plotly.express as px` |
| altair | 5.2+ | `import altair as alt` |

**Statistics & machine learning**

| Library | Version | Import |
|---|---|---|
| scikit-learn | 1.4+ | `from sklearn import ...` |
| statsmodels | 0.14+ | `import statsmodels.api as sm` |
| scipy | 1.12+ | `import scipy` |

**Data access & file formats**

| Library | Version | Import |
|---|---|---|
| requests | 2.31+ | `import requests` |
| beautifulsoup4 | 4.12+ | `from bs4 import BeautifulSoup` |
| openpyxl | 3.1+ | `import openpyxl` |
| pyarrow | 15+ | `import pyarrow` |

**Workspace**

| Library | Description |
|---|---|
| margin | Load datasets from your workspace |
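The `margin` helper is specific to this platform, and its API isn't documented here, so treat the call below as a hypothetical sketch rather than a confirmed signature:

```python
import margin

# Hypothetical usage -- the real function name and arguments may differ;
# check your workspace documentation for the actual margin API.
df = margin.load_dataset("my_dataset")  # assumed to return a DataFrame
```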
Need something else? Install packages within your session:
```python
# Install a package
!pip install package-name

# Install a specific version
!pip install package-name==1.2.3

# Install multiple packages
!pip install package1 package2 package3
```
Popular additions for common workloads:

```python
# NLP
!pip install nltk spacy transformers

# Time series
!pip install prophet pmdarima

# Geographic
!pip install geopandas folium

# Financial
!pip install yfinance pandas-ta
```
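After installing, it's worth confirming which version actually resolved; `importlib.metadata` is part of the standard library on Python 3.8+:

```python
# Verify an installed package's version (standard library)
import importlib.metadata

print(importlib.metadata.version("nltk"))
```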
For large files, use efficient formats and lazy loading:
```python
# Parquet is faster than CSV for large files
df = pd.read_parquet('large_file.parquet')

# Or use Polars for even better performance
import polars as pl
df = pl.read_parquet('large_file.parquet')

# Chunked reading for huge CSVs
chunks = pd.read_csv('huge.csv', chunksize=10000)
for chunk in chunks:
    process(chunk)  # replace with your per-chunk logic
```
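Polars can also defer the read entirely: `scan_parquet` builds a lazy query plan and pushes filters and column selection down into the file scan, so only the needed data is loaded. A minimal sketch with placeholder column names:

```python
import polars as pl

# Nothing is read until .collect(); the planner prunes columns and rows.
result = (
    pl.scan_parquet('large_file.parquet')
    .filter(pl.col('value') > 0)     # placeholder filter
    .select(['id', 'value'])         # placeholder columns
    .collect()
)
```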
Kernels have memory limits. Keep your session healthy:
```python
# Check memory usage
print(f"DataFrame size: {df.memory_usage(deep=True).sum() / 1e6:.1f} MB")

# Free memory by deleting large objects
del large_dataframe
import gc
gc.collect()

# Use efficient dtypes
df['category_col'] = df['category_col'].astype('category')
df['small_int'] = df['small_int'].astype('int8')
```
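Converting columns one by one gets tedious on wide frames; a small helper (a sketch, not a library function) can downcast every numeric column using pandas' built-in `pd.to_numeric`:

```python
import pandas as pd

def downcast_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Shrink each numeric column to the smallest dtype that fits its values."""
    out = df.copy()
    for col in out.select_dtypes(include='number').columns:
        kind = out[col].dtype.kind           # 'i'/'u' = integer, 'f' = float
        if kind in 'iu':
            out[col] = pd.to_numeric(out[col], downcast='integer')
        elif kind == 'f':
            out[col] = pd.to_numeric(out[col], downcast='float')
    return out
```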
Configure how outputs render:
```python
# Pandas display options
pd.set_option('display.max_columns', 50)
pd.set_option('display.max_rows', 100)
pd.set_option('display.float_format', '{:.2f}'.format)

# Matplotlib defaults
plt.rcParams['figure.figsize'] = [10, 6]
plt.rcParams['figure.dpi'] = 100

# Plotly defaults (for inline display)
import plotly.io as pio
pio.renderers.default = 'notebook'
```
Understanding the kernel lifecycle helps avoid surprises:
| State | What's Happening |
|---|---|
| Connecting | Establishing WebSocket connection to the kernel server |
| Warming up | Python environment initializing, loading packages. Cells you run will queue. |
| Connected | Ready to execute. All pre-installed packages available, plus your installed packages and variables. |
| Restarted | Fresh environment with pre-installed packages only; reinstall custom packages and re-run setup cells. |
| Disconnected | Nothing running. Reconnect to get a fresh environment. |
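Since a restart wipes installed packages and variables, it helps to keep your setup at the top of the notebook so recovery takes only a couple of runs. A sketch, with example package names (keep the install in its own cell, per the tip at the end):

```python
# Cell 1: installs only
!pip install nltk geopandas
```

```python
# Cell 2: imports and display options
import pandas as pd
import matplotlib.pyplot as plt

pd.set_option('display.max_columns', 50)
plt.rcParams['figure.figsize'] = [10, 6]
```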
Kernels have built-in limits to keep your session secure:
| Limit | Value | Description |
|---|---|---|
| Memory | ~800 MB | Per-kernel RAM limit |
| Execution timeout | 5 minutes | Max time for a single cell to run |
| Idle timeout | 10 minutes | Disconnects after inactivity (auto-reconnects when you run code) |
If your code hits the execution timeout, the kernel terminates and you'll need to reconnect. For long-running computations, consider breaking them into smaller chunks or using more efficient algorithms.
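One way to break work up is to checkpoint intermediate results to disk, so a re-run within the same session resumes from the cached step instead of repeating the expensive one (file and column names here are placeholders):

```python
import os
import pandas as pd

# Expensive step: run once, cache the result to Parquet.
if not os.path.exists('intermediate.parquet'):
    df = pd.read_csv('huge.csv')
    summary = df.groupby('key', as_index=False)['value'].sum()  # placeholder work
    summary.to_parquet('intermediate.parquet')

# Later cells start from the cached result.
summary = pd.read_parquet('intermediate.parquet')
```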
Tip: run `!pip install` commands in their own cell, so the installation finishes before any cell that imports the new package.