A Workshop for Ornithologists and Ecologists
Duration: 4 hours
Goal: Learn to leverage Google Gemini’s free tier for data analysis in ecology and ornithology
What You’ll Learn:
Materials: gemini.google.com - no local software (like R or Python) required!
| Time | Topic |
|---|---|
| 0:00-0:30 | Introduction to AI & LLMs |
| 0:30-1:15 | Getting Started with Gemini |
| 1:15-2:00 | Prompt Engineering Basics |
| 2:00-2:15 | Break |
| 2:15-3:15 | Data Analysis with Gemini |
| 3:15-4:00 | Advanced Applications & Wrap-up |
Artificial Intelligence (AI): Computer systems that perform tasks that normally require human intelligence
Traditional Programming:
AI/Machine Learning:
Applications You Might Know:
These all use different types of machine learning!
Large Language Models (LLMs): AI systems trained on vast amounts of text data
Key Characteristics:
Examples: GPT-4, Claude, Google Gemini, Llama
Simplified Process:
Think of it as: A very sophisticated autocomplete that understands context
Google’s Advanced AI Model
Capabilities:
Your Workspace Benefits:
Available in Google Workspace:
For this workshop: You have access to both versions through your Workspace account!
Google’s AI Research Assistant
What is NotebookLM?
Perfect for Research!
Use Cases:
Literature Review:
- Upload multiple research papers
- Ask cross-document questions
- Generate summaries
- Extract key findings
Data Documentation:
- Upload field notes (e.g., from BirdBox or CCFS sheets)
- Query your observations
- Connect related findings
Collaboration:
- Share notebooks with team
- Centralized knowledge base
- Track research progress
Writing Support:
- Synthesize information
- Generate outlines
- Cite sources automatically
When to Use Each:
| Gemini | NotebookLM |
|---|---|
| General questions | Source-specific research |
| Quick analysis | Deep literature review |
| Code generation | Document analysis |
| Brainstorming | Citation management |
| Real-time queries | Long-term projects |
Best Practice: Use both together for comprehensive research support!
You’re already set up!
Your Google Workspace account includes:
Today’s Focus: We’ll primarily use Gemini, with NotebookLM tips throughout
Main Components:
Pro Tip: Use separate chats for different projects/topics
What is a Prompt?
A prompt is the text you provide to the AI to get a response.
Example Prompts:
Key Point: Clear questions get clear answers!
Exercise (5 minutes):
What is a keystone species? Give me an example from
the San Francisco Bay salt marsh ecosystem.
Explain the difference between species richness and
species evenness in simple terms.
What are the main challenges in studying bird migration
patterns?
Notice: Response speed, detail level, accuracy
Share: What surprised you about the responses?
Research Support:
Data Analysis:
Writing Help:
Definition: The practice of designing effective prompts to get better AI responses
Why It Matters:
1. Be Specific
❌ “Tell me about birds”
✅ “Explain the nesting habits of California Least Terns at the Alameda Point colony”
2. Provide Context
❌ “Analyze this data”
✅ “Analyze this bird banding data from Coyote Creek Field Station. Columns: species, age, sex, wing_length”
3. Set Constraints
❌ “How do I analyze diversity?”
✅ “Explain Shannon diversity index in 3 paragraphs suitable for undergraduate ecology students”
4. Specify Format
❌ “Compare these species”
✅ “Create a comparison table for Snowy Plover and Killdeer, comparing nesting habitat, camouflage, and conservation status”
Role Assignment:
“You are an expert ornithologist. Explain…”
Step-by-Step Requests:
“Walk me through the steps to calculate…”
Examples in Prompts:
“Like this example: [provide example], now do this with…”
Follow-up Refinement:
“That’s good, but make it more concise”
❌ Too Vague: “Help with my research”
✅ Better: “Suggest statistical methods for comparing bird abundance across 5 habitat types”
❌ Assuming Too Much: “Use the standard method”
✅ Better: “Use Principal Component Analysis (PCA) to…”
❌ No Error Checking: Accepting first answer
✅ Better: “Verify this answer: Is this the correct formula?”
Exercise (10 minutes):
Part A: First, copy this poor prompt into Gemini:
Tell me about bird diversity analysis
Part B: Now copy this improved version:
I'm an ornithologist studying songbird communities in
California oak woodlands. Explain how to compare bird diversity
between riparian corridors and upland habitats. Include which
diversity metrics to use and why.
Part C: Create your own improved prompt for this vague question:
How do I analyze my data?
Your version should specify: data type, research question, sample size, and goal.
Bonus: Paste your improved prompts into Gemini and compare results!
Share: What made the difference in response quality?
For Ecology/Ornithology:
Always include:
Example: “I have 10 years of banding data for Common Yellowthroats at CCFS. Suggest appropriate methods to analyze temporal trends in body condition.”
What Gemini Can Help With:
What It Cannot Do:
*Advanced paid features may allow code execution
Typical Process:
Key: You’re still in control! Gemini is your assistant.
Your Question:
“I have bird survey data with columns: site_id, species, count, habitat_type. I want to compare species diversity between forest and grassland habitats. Suggest an analysis approach.”
Gemini Might Suggest:
Your Prompt:
“Write R code to calculate the Shannon diversity index for each site from a dataframe called ‘bird_data’ with columns: site_id, species, count, habitat_type. Then compare diversity between habitats using a t-test.”
Gemini Will Generate:
Your Prompt:
“Write Python code using pandas and scikit-bio to calculate Shannon diversity and create a boxplot comparing forest vs grassland sites.”
Gemini Will Generate:
Pro Tip: Always specify libraries you prefer!
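For reference, the Shannon calculation that the prompts above ask for is short enough to sketch by hand. The block below is a minimal illustration, not Gemini's actual output: it uses only Python's standard library instead of pandas or scikit-bio, it leaves out the boxplot, and the survey records are made-up values chosen for the example.

```python
import math
from collections import defaultdict

# Hypothetical survey records: (site_id, habitat_type, species, count).
# Species codes follow the banding-style codes used elsewhere in this workshop.
records = [
    ("F1", "forest", "COYE", 5),
    ("F1", "forest", "BHGR", 3),
    ("F1", "forest", "NUWO", 2),
    ("G1", "grassland", "COYE", 9),
    ("G1", "grassland", "BHGR", 1),
]


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over species with count > 0."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def shannon_by_site(rows):
    """Group species counts by site, then compute H' for each site."""
    by_site = defaultdict(list)
    for site_id, _habitat, _species, count in rows:
        by_site[site_id].append(count)
    return {site: shannon(counts) for site, counts in by_site.items()}


site_diversity = shannon_by_site(records)
for site, h in sorted(site_diversity.items()):
    print(f"{site}: H' = {h:.3f}")
```

Having a small hand-checked version like this gives you something to compare against when Gemini returns code built on a library you have not used before.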
You Can Ask:
Example Exchange:
You: “I got a p-value of 0.23 comparing diversity between habitats. What does this mean?”
Gemini: “A p-value of 0.23 is above the conventional threshold (α = 0.05), so the difference is not statistically significant. That does not show the habitats are equally diverse. This suggests…”
Best Practices:
Example:
“I have a CSV with these columns and first 3 rows:
species,count,site
COYE,5,CCFS_1
BHGR,3,CCFS_1
NUWO,2,CCFS_2
…”
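If your file is large, a few lines of script can produce the header-plus-sample snippet for you. This is a sketch under stated assumptions: `csv_text` stands in for the contents of a real file, its fourth data row is invented to show truncation, and the helper name `csv_preview` is hypothetical.

```python
import csv
import io

# Stand-in for the contents of a real CSV file. The first three data rows
# mirror the example above; the fourth is a hypothetical extra row.
csv_text = """species,count,site
COYE,5,CCFS_1
BHGR,3,CCFS_1
NUWO,2,CCFS_2
COYE,4,CCFS_2
"""


def csv_preview(text, n_rows=3):
    """Return the header plus the first n_rows data rows as CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    out = io.StringIO()
    # Slice keeps the header (row 0) and n_rows data rows.
    csv.writer(out, lineterminator="\n").writerows(rows[: n_rows + 1])
    return out.getvalue().rstrip("\n")


prompt = (
    "I have a CSV with these columns and first 3 rows:\n"
    + csv_preview(csv_text)
    + "\n…"
)
print(prompt)
```

Sharing only a preview keeps the prompt short and avoids pasting a whole dataset into the chat.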
Exercise (15 minutes):
Copy this complete scenario into Gemini:
I have bird survey data from 10 sites. Here are the results:
Forest sites:
- Site F1: 12 species, 45 total birds, Shannon diversity = 2.21
- Site F2: 15 species, 52 total birds, Shannon diversity = 2.48
- Site F3: 14 species, 48 total birds, Shannon diversity = 2.35
- Site F4: 13 species, 50 total birds, Shannon diversity = 2.28
- Site F5: 16 species, 55 total birds, Shannon diversity = 2.52
Grassland sites:
- Site G1: 8 species, 38 total birds, Shannon diversity = 1.85
- Site G2: 10 species, 41 total birds, Shannon diversity = 1.98
- Site G3: 9 species, 39 total birds, Shannon diversity = 1.91
- Site G4: 11 species, 43 total birds, Shannon diversity = 2.05
- Site G5: 7 species, 35 total birds, Shannon diversity = 1.76
Questions:
1. Calculate the mean Shannon diversity for each habitat type
2. Does there appear to be a difference between habitats?
3. Would a t-test be appropriate here? Why or why not?
4. What would be the null and alternative hypotheses?
5. What biological factors might explain any differences?
Then try follow-up questions:
- “Write R code to perform this analysis”
- “How would I visualize these results?”
- “What if the data isn’t normally distributed?”
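To check Gemini's arithmetic on this exercise, the means and a test statistic can be reproduced independently. The sketch below uses the ten Shannon values from the scenario and only Python's standard library. Welch's t-test (which does not assume equal variances) is one reasonable choice here, not the only one, and the p-value is left to a statistics package.

```python
import math
from statistics import mean, variance

# Shannon diversity values copied from the exercise scenario.
forest = [2.21, 2.48, 2.35, 2.28, 2.52]
grassland = [1.85, 1.98, 1.91, 2.05, 1.76]


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


mean_forest = mean(forest)
mean_grassland = mean(grassland)
t_stat, df = welch_t(forest, grassland)

print(f"Mean forest H'    = {mean_forest:.3f}")
print(f"Mean grassland H' = {mean_grassland:.3f}")
print(f"Welch t = {t_stat:.2f}, df = {df:.1f}")
```

If Gemini reports means that differ from these, ask it to show its work step by step.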
Statistical Tests:
Visualization:
Interpretation:
How Gemini Can Help:
Field Guides: “What are the key field marks to distinguish a Pacific-slope Flycatcher from a Cassin’s Vireo?”
Diagnostic Keys: “Create a dichotomous key for identifying local warblers”
Behavior Patterns: “Describe typical foraging behavior of Nuttall’s Woodpecker vs Downy Woodpecker in oak woodlands”
Note: Always verify with field guides and experts!
Effective Uses:
Summarizing Concepts: “Summarize the current understanding of island biogeography theory”
Finding Research Gaps: “What are the understudied aspects of Salt Marsh Common Yellowthroat ecology?”
Explaining Methods: “Explain how capture-recapture methods work for population estimation”
Comparing Approaches: “Compare radio telemetry vs GPS tracking for bird movement studies”
Brainstorming with Gemini:
Your Prompt: “I study Snowy Plover nesting on managed ponds in the South Bay. Suggest 5 testable hypotheses about how predator presence affects fledging success.”
Gemini Might Suggest:
Your Role: Evaluate, refine, and test!
What to Ask:
Example:
“I want to test if bird feeders affect local bird diversity. Suggest an experimental design with proper controls.”
Paper Sections:
Methods: “Write a methods paragraph describing point count surveys conducted at 30 sites, 3 times each, during breeding season”
Results: “Describe these statistical results in clear prose: [paste results]”
Discussion: “Suggest possible explanations for why species richness decreased with urbanization intensity”
Important: Always personalize and verify! Don’t copy-paste directly.
Useful Prompts:
Remember:
Create Materials:
Citizen Science:
Choose ONE scenario (10 minutes) and copy it into Gemini:
Option A - Species Identification:
I observed a small songbird in a temperate forest with these
characteristics:
- Size: smaller than a robin
- Upperparts: olive-brown
- Underparts: white with brown streaking
- Behavior: foraging on the ground, scratching in leaf litter
- Song: loud, dry trill or "wick-wick-wick"
What species might this be? Provide a shortlist of possibilities
and key distinguishing features. I'm in the San Francisco Bay Area.
Option B - Research Design:
I want to study the effect of tidal restoration on marsh bird
nesting success. I have access to 10 restored sites and 10 reference
sites in the South Bay and 3 years of data.
Help me design this study:
1. How should I allocate my effort?
2. What variables should I measure?
3. What are potential confounding factors?
4. What statistical approach would I use?
5. What are the main limitations?
Option C - Writing Help:
Help me write a methods paragraph. I conducted point count
surveys at 30 sites (15 forest, 15 grassland). Each site was
surveyed 3 times between May 15 and June 30, 2024. Each survey
lasted 10 minutes. I recorded all birds seen or heard within
50m radius between 6am-10am on days with no rain and wind <15 km/h.
Write this as a clear methods paragraph for a scientific paper.
Share: What was helpful? What would you need to verify or change?
Motus Wildlife Tracking System: A collaborative network of automated radio telemetry stations.
Relevance to SFBBO:
- Pacific Coast Motus Network: Tracking shorebirds and landbirds through the Bay.
- Stations: CCFS (Coyote Creek), Palo Alto Baylands, Ravenswood, and more.
- Data Challenge: Large SQLite files with complex relationships and potential false positives.
Scenario: You’ve downloaded your Motus project data. It’s a .motus SQLite file.
Prompt to Gemini:
"I have a Motus tracking dataset in SQLite format. Explain the relationship between the 'hits', 'runs', and 'tagdeps' tables. What columns should I look for to identify unique tags and their detection times?"
Why this helps: Gemini explains the complex relational structure without you needing to be a database expert!
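You can also look at the table structure yourself and paste it into your prompt, which gives Gemini the real column names to work with. The sketch below uses Python's built-in `sqlite3` module. It runs against a toy in-memory database whose tables and columns are simplified placeholders, not the actual Motus schema, and the helper name `describe_tables` is hypothetical.

```python
import sqlite3


def describe_tables(conn, table_names):
    """Return {requested name: [column names]} for the tables that exist."""
    # Map lower-cased names to the names as stored, so 'tagdeps' finds 'tagDeps'.
    existing = {
        row[0].lower(): row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type IN ('table', 'view')"
        )
    }
    schema = {}
    for name in table_names:
        actual = existing.get(name.lower())
        if actual is not None:
            # PRAGMA table_info returns one row per column; index 1 is the name.
            cols = conn.execute(f'PRAGMA table_info("{actual}")').fetchall()
            schema[name] = [col[1] for col in cols]
    return schema


# Toy in-memory database standing in for a real .motus file.
# These table definitions are illustrative placeholders only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (hitID INTEGER, runID INTEGER, ts REAL, sig REAL)")
conn.execute("CREATE TABLE runs (runID INTEGER, motusTagID INTEGER, len INTEGER)")
conn.execute("CREATE TABLE tagDeps (deployID INTEGER, tagID INTEGER, tsStart REAL)")

schema = describe_tables(conn, ["hits", "runs", "tagdeps"])
for table, columns in schema.items():
    print(f"{table}: {', '.join(columns)}")
```

For a real project you would replace `":memory:"` with the path to your own .motus file and skip the `CREATE TABLE` lines.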
Scenario: You have a small sample of your Motus data. You want Gemini to find the “bad” detections for you.
Copy into Gemini:
"Here is a sample of my Motus detections (StationID, TagID, Hits, SNR):
1. CCFS, 5678, 2, 1.5
2. CCFS, 5678, 15, 8.2
3. PABAY, 5678, 1, 0.5
4. PABAY, 5678, 12, 7.5
Which of these detections are likely 'false positives' based on low hit counts
or low signal-to-noise (SNR)? Explain your reasoning."
Why this helps: You learn to use Gemini as a filter before you even touch R!
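The same reasoning can be written as an explicit rule, which makes the cutoffs visible and repeatable. This is a minimal sketch using the four sample detections above. The thresholds `MIN_HITS` and `MIN_SNR` are assumptions chosen for illustration, not official Motus filtering criteria.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    station: str
    tag_id: int
    hits: int
    snr: float


# The four sample detections from the prompt above.
detections = [
    Detection("CCFS", 5678, 2, 1.5),
    Detection("CCFS", 5678, 15, 8.2),
    Detection("PABAY", 5678, 1, 0.5),
    Detection("PABAY", 5678, 12, 7.5),
]

MIN_HITS = 3   # assumed cutoff: fewer hits than this is treated as suspect
MIN_SNR = 2.0  # assumed cutoff: weaker signal than this is treated as suspect


def is_suspect(d, min_hits=MIN_HITS, min_snr=MIN_SNR):
    """Flag a detection with too few hits or too weak a signal."""
    return d.hits < min_hits or d.snr < min_snr


suspect = [d for d in detections if is_suspect(d)]
for d in suspect:
    print(f"Suspect: {d.station}, tag {d.tag_id}, hits={d.hits}, SNR={d.snr}")
```

Comparing Gemini's answer with a simple rule like this is a quick way to see whether its reasoning matches the criteria you actually want to apply.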
Scenario: You want to understand the migration path without writing code.
Prompt to Gemini:
"I have three Motus detections for Tag 101:
- 08:00: CCFS (South Bay)
- 10:30: Hayward Shoreline (East Bay)
- 13:00: San Pablo Bay (North Bay)
Describe the likely movement path of this bird through the SF Bay.
What direction is it traveling? What is the approximate distance covered?"
Output: Gemini interprets the geography and timing for you!
Copy into Gemini:
"You are a data analyst helping SFBBO biologists. We have a tagged
Salt Marsh Common Yellowthroat detected at CCFS (Lat 37.4, Lon -121.9)
and then 2 hours later at Palo Alto Baylands (Lat 37.45, Lon -122.1).
1. Calculate the minimum distance traveled in km.
2. What was the minimum flight speed in km/h?
3. Write an R code snippet to calculate 'time since deployment'
for this tag given a deployment date of 2024-05-01."
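Because language models can slip on arithmetic, it helps to check the distance and speed independently. The sketch below uses the haversine (great-circle) formula with the coordinates and the 2-hour gap from the prompt. It is written in Python rather than R to stay dependency-free, and the detection date used for "time since deployment" is a hypothetical value.

```python
import math
from datetime import date

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def days_since_deployment(deployed, on_date):
    """Whole days elapsed between deployment and a later date."""
    return (on_date - deployed).days


# Coordinates from the prompt: CCFS, then Palo Alto Baylands 2 hours later.
distance_km = haversine_km(37.4, -121.9, 37.45, -122.1)
speed_kmh = distance_km / 2.0  # minimum speed, assuming a straight-line path

# Deployment date from the prompt; the detection date is hypothetical.
days = days_since_deployment(date(2024, 5, 1), date(2024, 6, 1))

print(f"Minimum distance: {distance_km:.1f} km")
print(f"Minimum speed: {speed_kmh:.1f} km/h")
print(f"Days since deployment: {days}")
```

The straight-line figure is a lower bound; the bird's real path between stations is unknown.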
Gemini Can Make Mistakes:
Always Verify:
Cross-Reference:
Critical Questions:
Use Gemini Itself:
“Is this statement accurate: [paste claim]? Provide sources.”
DO NOT Share:
Safe to Share:
Research Ethics:
Data Interpretation:
Rate Limits:
Features:
Workarounds:
Don’t Use For:
It’s a Tool, Not a Replacement:
Guidelines:
Acceptable:
- Brainstorming ideas
- Learning concepts
- Debugging code
- Improving writing clarity
Requires Disclosure:
- Substantial text generation
- Code generation (check journal policies)
- Analysis suggestions used directly
Not Acceptable:
- Fabricating data or results
- Plagiarizing AI output as original
- Bypassing learning in courses
Check your institution’s policies!
Gemini Can Process:
Images:
- “What species is in this photo?”
- “Describe the habitat in this image”
- “Read this field data sheet”
Combinations:
- Upload image + ask questions
- Combine data tables with text
- Analyze charts/graphs
Try it: Upload a bird photo and ask for ID help!
Practical Workflow Example:
Step 1: Create a Notebook for Your Project
- Go to notebooklm.google.com
- Create “Songbird Diversity Project”
Step 2: Upload Your Sources
- Research papers on bird diversity
- Your field notes (Google Docs)
- Previous study results (PDFs)
- Methodology references
Step 3: Ask Cross-Document Questions
Copy these into NotebookLM (after uploading sources):
Summarize the main findings about forest bird diversity
across all uploaded papers. What are the common themes?
Based on my field notes and the literature, what factors
might explain the patterns I'm seeing in my data?
Create a comparison table of the statistical methods used
in these studies for analyzing bird diversity.
Generate a study guide covering the key concepts I need
to understand for my analysis.
Powerful Workflow:
Key Capabilities:
Pro Tip: Create separate notebooks for different projects or research phases
Build on Previous Responses:
Advantage: Gemini keeps track of context within a conversation, so follow-ups can build on earlier answers!
Reusable Prompts:
Save effective prompts for common tasks:
Statistical Analysis Template: “I have data on [VARIABLE] from [STUDY_SYSTEM] with [SAMPLE_SIZE] samples. I want to test [HYPOTHESIS]. Suggest an appropriate statistical approach and R code.”
Literature Summary Template: “Summarize current understanding of [TOPIC] in [FIELD], focusing on [ASPECT]. Include key studies and knowledge gaps.”
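If you reuse templates often, you can store them as text with named slots and fill them in programmatically. The sketch below does this for the Statistical Analysis Template using Python's built-in string formatting. The helper name `fill_template` and the example values (including the sample size) are hypothetical.

```python
# The Statistical Analysis Template, with [SLOTS] rewritten as {named} fields.
STAT_TEMPLATE = (
    "I have data on {variable} from {study_system} with {sample_size} samples. "
    "I want to test {hypothesis}. "
    "Suggest an appropriate statistical approach and R code."
)


def fill_template(template, **fields):
    """Fill every named slot; raises KeyError if a slot is left out."""
    return template.format(**fields)


# Example values are illustrative only.
prompt = fill_template(
    STAT_TEMPLATE,
    variable="body condition",
    study_system="Common Yellowthroats banded at CCFS",
    sample_size=250,
    hypothesis="whether body condition has changed over 10 years",
)
print(prompt)
```

Because a missing slot raises an error, you cannot accidentally send a prompt that still contains an unfilled placeholder.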
Daily Research Tasks:
Project Phases:
Workflow Integration:
NotebookLM + Gemini: Upload papers to NotebookLM → Get summaries → Use insights in Gemini for analysis
Gemini + R/Python: Generate code → Test in IDE → Refine with Gemini
NotebookLM + Literature: Upload papers to NotebookLM → Ask cross-paper questions → Generate literature review
Gemini + Collaboration: Draft text → Team review → Refine with Gemini
NotebookLM + Field Work: Upload field notes → Query observations → Track patterns over time
All Together: Best when combined with textbooks, courses, experts, and traditional methods
Google AI Tools Evolve:
Follow:
Google AI Tools:
Prompt Engineering:
AI in Science:
Communities:
Explore on Your Own:
Documentation: ai.google.dev
Individual Activity (5 minutes):
Think about your current research/work and write a detailed prompt in your notes or directly in Gemini:
Template to use:
[Describe your research context]
I want to [specific goal]
I have [data/resources available]
Help me:
1. [Question 1]
2. [Question 2]
3. [Question 3]
Example:
I study marsh bird populations in restored tidal marshes. I want to
determine if restoration age (measured by years since levee breach)
affects Ridgway's Rail occupancy. I have 15 sites surveyed over
5 years with call-back survey results and vegetation data.
Help me:
1. Design the statistical analysis
2. Identify potential confounding variables
3. Suggest appropriate visualizations
Optional: Test your prompt in Gemini and share results with the group!
✅ AI is a powerful assistant, not a replacement for expertise
✅ Good prompts = Better results (be specific, provide context)
✅ Always verify AI outputs against trusted sources
✅ Free tier is capable enough for most learning and research tasks
✅ Ethical use matters - maintain integrity and privacy
✅ Start small - integrate gradually into your workflow
This Week:
This Month:
Ongoing:
Open Q&A Time
Stay Connected:
Websites:
Example Use Cases:
Further Learning:
Remember:
Questions? Contact [your email]
Materials: [Repository link]
Quick Prompts for Common Tasks:
| Task | Prompt Template |
|---|---|
| Statistical Test | “I have [DATA TYPE] measuring [VARIABLES] with [N] samples. Suggest an appropriate statistical test.” |
| R Code | “Write R code using [PACKAGES] to [TASK] with data structure: [DESCRIPTION]” |
| Interpret Result | “I got [RESULT] from [TEST]. Explain what this means for [HYPOTHESIS].” |
| Literature Help | “Summarize current understanding of [TOPIC] in [FIELD], focusing on [ASPECT].” |
| Species ID | “What distinguishes [SPECIES A] from [SPECIES B]? Focus on [FIELD MARKS/BEHAVIOR].” |
| Visualization | “Create [PLOT TYPE] in [R/Python] showing [RELATIONSHIP] with [AESTHETICS].” |
Save these for quick reference!