noodlbox

Cypher Queries

Guide to writing Cypher queries for noodlbox

Cypher is a declarative graph query language used to query the noodlbox knowledge graph. This guide covers common patterns and best practices.

Basics

Node Matching

Match nodes by label:

MATCH (s:CodeSymbol)
RETURN s.name

Match with properties:

MATCH (s:CodeSymbol {name: "authenticate"})
RETURN s

Match with WHERE clause:

MATCH (s:CodeSymbol)
WHERE s.symbol_type = "Function"
RETURN s.name

Relationship Matching

Basic relationship:

MATCH (a:CodeSymbol)-[:CALLS]->(b:CodeSymbol)
RETURN a.name, b.name

With direction:

// Outgoing calls: a calls b
MATCH (a)-[:CALLS]->(b) RETURN a, b

// Incoming calls: a is called by b
MATCH (a)<-[:CALLS]-(b) RETURN a, b

Variable-length Paths

Match paths of varying length:

// 1 to 3 hops
MATCH path = (a)-[:CALLS*1..3]->(b)
RETURN path

// Any length (one or more hops); unbounded, so it can be slow on large graphs
MATCH path = (a)-[:CALLS*]->(b)
RETURN path

Common Patterns

Find All Callers of a Function

MATCH (caller:CodeSymbol)-[:CALLS]->(target:CodeSymbol)
WHERE target.name = "authenticateUser"
RETURN caller.name, caller.file_path

Find All Functions a Function Calls

MATCH (source:CodeSymbol)-[:CALLS]->(callee:CodeSymbol)
WHERE source.name = "handleRequest"
RETURN callee.name, callee.file_path

Find Call Chain

MATCH path = (source:CodeSymbol)-[:CALLS*1..5]->(target:CodeSymbol)
WHERE source.name = "main" AND target.name = "saveToDatabase"
RETURN path

Find Functions in a File

MATCH (s:CodeSymbol)-[:CONTAINED_BY]->(f:File)
WHERE f.path = "src/auth/authenticate.ts"
RETURN s.name, s.symbol_type, s.line_number
ORDER BY s.line_number

Find Community Members

MATCH (s:CodeSymbol)-[:MEMBER_OF]->(c:Community)
WHERE c.label = "Authentication"
RETURN s.name, s.symbol_type

Find Cross-Community Calls

MATCH (s1:CodeSymbol)-[:MEMBER_OF]->(c1:Community),
      (s2:CodeSymbol)-[:MEMBER_OF]->(c2:Community),
      (s1)-[:CALLS]->(s2)
WHERE c1 <> c2
RETURN c1.label AS from_community,
       c2.label AS to_community,
       s1.name AS caller,
       s2.name AS callee

Find Process Steps

MATCH (s:CodeSymbol)-[r:STEP_IN_PROCESS]->(p:Process)
WHERE p.label = "UserLogin"
RETURN s.name, r.step, s.file_path
ORDER BY r.step

Find Entry Points to a Community

MATCH (external:CodeSymbol)-[:CALLS]->(entry:CodeSymbol),
      (entry)-[:MEMBER_OF]->(c:Community),
      (external)-[:MEMBER_OF]->(other:Community)
WHERE c.label = "Authentication" AND c <> other
RETURN DISTINCT entry.name, external.name, other.label

Aggregations

Count Symbols by Type

MATCH (s:CodeSymbol)
RETURN s.symbol_type, count(*) AS symbol_count
ORDER BY symbol_count DESC

Count Calls Per Function

MATCH (s:CodeSymbol)
OPTIONAL MATCH (s)-[:CALLS]->(callee)
RETURN s.name, count(callee) AS outgoing_calls
ORDER BY outgoing_calls DESC
LIMIT 10

Community Statistics

MATCH (c:Community)
OPTIONAL MATCH (s:CodeSymbol)-[:MEMBER_OF]->(c)
RETURN c.label, count(s) AS symbol_count, c.cohesion
ORDER BY symbol_count DESC

Filtering

String Matching

// Contains
MATCH (s:CodeSymbol)
WHERE s.name CONTAINS "auth"
RETURN s.name

// Starts with
MATCH (s:CodeSymbol)
WHERE s.name STARTS WITH "handle"
RETURN s.name

// Regular expression
MATCH (s:CodeSymbol)
WHERE s.name =~ ".*User.*"
RETURN s.name

Numeric Comparisons

MATCH (c:Community)
WHERE c.cohesion > 0.8
RETURN c.label, c.cohesion

List Operations

MATCH (s:CodeSymbol)
WHERE s.symbol_type IN ["Function", "Method"]
RETURN s.name

Best Practices

1. Start Specific

Begin with specific matches, then broaden:

// Good - specific starting point
MATCH (s:CodeSymbol {name: "authenticate"})-[:CALLS]->(callee)
RETURN callee

// Avoid - too broad: with no labels or properties, this matches every CALLS relationship
MATCH (s)-[:CALLS]->(callee)
RETURN s, callee

2. Limit Results

Add a LIMIT while exploring so that a broad match does not return the whole graph:

MATCH (s:CodeSymbol)
RETURN s.name
LIMIT 100

3. Use Indexes

Query on indexed properties (name, file_path) for performance.
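
As a rough sketch of what this can look like, the query below filters on an exact file_path value rather than a substring. It assumes file_path is indexed as stated above and that exact-match predicates are the ones that benefit. Whether CONTAINS can also use an index depends on the backend, so treat this as illustrative only.

```cypher
// Exact match on a property that is assumed to be indexed (file_path)
MATCH (s:CodeSymbol)
WHERE s.file_path = "src/auth/authenticate.ts"
RETURN s.name, s.symbol_type
LIMIT 100
```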

4. Check Schema First

Use the schema resource to verify property names:

GET db://schema/{repository}

5. Comment Your Queries

// Find all authentication functions called by API handlers
MATCH (api:CodeSymbol)-[:CALLS]->(auth:CodeSymbol)
WHERE api.file_path CONTAINS "/api/"
  AND auth.name CONTAINS "auth"
RETURN api.name, auth.name

Troubleshooting

No Results

  • Check property names against schema
  • Verify node labels are correct
  • Try broader matching first
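
One way to apply the checks above is to look at the values that actually exist before filtering on them. The sketch below uses only labels and properties that appear elsewhere in this guide. The search string "authenticate" is just an example.

```cypher
// List the symbol_type values present in the graph
MATCH (s:CodeSymbol)
RETURN DISTINCT s.symbol_type
LIMIT 25

// Replace an exact name match with a substring match to see near misses
MATCH (s:CodeSymbol)
WHERE s.name CONTAINS "authenticate"
RETURN s.name, s.file_path
LIMIT 25
```

If the second query returns rows, the original exact match most likely used a slightly different name or casing.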

Slow Queries

  • Add more specific WHERE clauses
  • Use LIMIT during exploration
  • Start from specific nodes, not broad patterns
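
The three suggestions above can be combined in one query. The following is a minimal sketch, assuming a symbol named "main" exists as in the call chain example. The hop bound and the LIMIT value are arbitrary choices for illustration.

```cypher
// Anchor on one named symbol, bound the path length, and cap the result size
MATCH path = (source:CodeSymbol {name: "main"})-[:CALLS*1..3]->(callee:CodeSymbol)
RETURN path
LIMIT 50
```

Raise the hop bound or the LIMIT gradually once the narrow version returns quickly.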

Property Not Found

  • Property names are case-sensitive
  • Check schema for exact names
  • Some properties may be optional
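
When a property may be missing on some nodes, an explicit null check keeps those nodes out of the result. In the sketch below, the idea that line_number is absent on some symbols is an assumption made only for illustration. Check the schema for which properties are actually optional.

```cypher
// Keep only symbols that have a line_number set
MATCH (s:CodeSymbol)
WHERE s.line_number IS NOT NULL
RETURN s.name, s.line_number
ORDER BY s.line_number
LIMIT 100
```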
