Data Science Riddle
Why do we use Batch Normalization?
Anonymous Quiz
32% Speeds up training
40% Prevents overfitting
8% Adds non-linearity
20% Reduces dataset size
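For context on the first option: batch normalization standardizes each feature over the mini-batch and then applies a learnable scale and shift, which is generally credited with making training faster and more stable. Below is a minimal pure-Python sketch of the training-mode forward pass. The names `batch_norm`, `gamma`, `beta`, and `eps` are illustrative assumptions, not something taken from the quiz.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) over the mini-batch (rows).

    batch: list of samples, each a list of floats of equal length.
    gamma, beta: scale and shift (learnable in a real layer; fixed scalars here).
    """
    n = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        # Population (biased) variance over the batch.
        var = sum((x - mean) ** 2 for x in column) / n
        std = math.sqrt(var + eps)
        for i in range(n):
            out[i][j] = gamma * (column[i] - mean) / std + beta
    return out

# Two features on very different scales end up on a comparable scale.
activations = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
normalized = batch_norm(activations)
for row in normalized:
    print([round(v, 3) for v in row])
```

After normalization both columns have roughly zero mean and unit variance, so later layers see inputs on a consistent scale regardless of the raw magnitudes.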
Data Science Riddle
Your object detection model misses small objects. Easiest fix?
Anonymous Quiz
24% Use larger input images
27% Add more classes
35% Reduce learning rate
14% Train longer
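One way to see why input resolution matters for small objects: after resizing, a small object may cover less than one cell of the detector's coarsest feature map. Below is a back-of-the-envelope sketch. The output stride of 32, the image width, and the object width are made-up illustrative values, and `cells_covered` is a hypothetical helper, none of it from the quiz.

```python
def cells_covered(object_px, original_size, input_size, stride=32):
    """Feature-map cells (along one axis) spanned by an object after resizing.

    object_px: object width in the original image, in pixels.
    original_size: original image width in pixels.
    input_size: width the image is resized to before the network.
    stride: downsampling factor of the feature map (assumed value).
    """
    resized_px = object_px * input_size / original_size
    return resized_px / stride

# A 40 px object in a 1920 px wide image (illustrative numbers).
small_input = cells_covered(40, 1920, 640)    # object shrinks to ~13 px
large_input = cells_covered(40, 1920, 1280)   # object shrinks to ~27 px
print(f"640 px input:  {small_input:.2f} cells")
print(f"1280 px input: {large_input:.2f} cells")
```

Doubling the input width doubles the object's footprint on the feature map, at the cost of more compute per image.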
AI that creates AI: ASI-ARCH finds 106 new SOTA architectures
ASI-ARCH: an experimental ASI that autonomously researches and designs neural nets. It hypothesizes, codes, trains & tests models.
Scale:
1,773 experiments, 20,000+ GPU-hours.
Stage 1 (20M params, 1B tokens): 1,350 candidates beat DeltaNet.
Stage 2 (340M params): 400 models → 106 SOTA winners.
Top 5 trained on 15B tokens vs Mamba2 & Gated DeltaNet.
Results:
PathGateFusionNet: 48.51 avg (Mamba2: 47.84, Gated DeltaNet: 47.32).
BoolQ: 60.58 vs 60.12 (Gated DeltaNet).
Consistent gains across tasks.
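The margins behind these numbers are small, which is easy to check directly. Here is a quick sketch using only the scores quoted above; the variable names are illustrative.

```python
# Average benchmark scores quoted in the post.
scores = {
    "PathGateFusionNet": 48.51,
    "Mamba2": 47.84,
    "Gated DeltaNet": 47.32,
}

best = scores["PathGateFusionNet"]
# Lead of PathGateFusionNet over each baseline, in points.
margins = {
    name: round(best - score, 2)
    for name, score in scores.items()
    if name != "PathGateFusionNet"
}
# BoolQ comparison against Gated DeltaNet, also from the post.
boolq_margin = round(60.58 - 60.12, 2)

print(margins)
print(boolq_margin)
```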
Insights:
Prefers proven tools (gating, convs), refines them iteratively.
Ideas come from: 51.7% literature, 38.2% self-analysis, 10.1% originality.
SOTA share: self-analysis ↑ to 44.8%, literature ↓ to 48.6%.
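The shift in idea origin between all candidates and the SOTA subset can be tabulated from the percentages above. The post does not state the originality share among SOTA models, so this sketch derives it under the assumption that the three categories sum to 100%.

```python
# Share of ideas by origin (percent), as quoted in the post.
all_models = {"literature": 51.7, "self-analysis": 38.2, "originality": 10.1}
sota_models = {"literature": 48.6, "self-analysis": 44.8}

# ASSUMPTION: the three categories sum to 100%, so the remainder is originality.
sota_models["originality"] = round(100 - sum(sota_models.values()), 1)

# Change in share (percentage points) when moving from all models to SOTA models.
shift = {k: round(sota_models[k] - all_models[k], 1) for k in all_models}
for origin, delta in shift.items():
    print(f"{origin}: {delta:+.1f} points")
```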
@datascience_bds
Databricks Tip: REPLACE vs MERGE
When updating Delta tables, you've got two powerful options:
REPLACE TABLE … ON
Like throwing away the entire library and rebuilding it.
- Drops the old table & recreates it.
- Schema + data = fully replaced.
- Fast but destructive: the current data is overwritten.
- Best for full refreshes or schema changes.
MERGE
Like updating only the books that changed.
- Works row by row.
- Updates, inserts, or deletes specific records.
- Preserves unchanged data.
- Best for incremental updates or CDC (Change Data Capture).
Key Difference
- REPLACE = Start fresh with a new table.
- MERGE = Surgically update rows without losing the rest.
Rule of thumb:
Use REPLACE for full rebuilds,
use MERGE for incremental upserts.
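To make the contrast concrete, here is a pure-Python analogy. It is not Databricks or Delta Lake code: a table is modeled as a dict keyed by a hypothetical `id`, and `replace_table` and `merge_into` are made-up helper names that only mimic the semantics described above.

```python
def replace_table(target, source):
    """Full rebuild: the old contents are discarded and the source becomes the table."""
    return dict(source)

def merge_into(target, source, deleted_ids=()):
    """Incremental upsert: update matching keys, insert new ones, delete flagged ones."""
    result = dict(target)          # unchanged rows are preserved
    for key, row in source.items():
        result[key] = row          # update when matched, insert when not matched
    for key in deleted_ids:
        result.pop(key, None)      # delete specific records
    return result

books = {1: "Dune", 2: "Emma", 3: "Ulysses"}
changes = {2: "Emma (2nd ed.)", 4: "Beloved"}

replaced = replace_table(books, changes)
merged = merge_into(books, changes, deleted_ids=[3])
print(replaced)  # only the two rows from `changes` remain
print(merged)    # row 1 kept, row 2 updated, row 3 deleted, row 4 inserted
```

In the library analogy, `replace_table` rebuilds the shelves from scratch, while `merge_into` touches only the books that changed.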
#Databricks #DeltaLake