| 1 | +# CLAUDE.md - SQL4Json |
| 2 | + |
| 3 | +## Project Overview |
| 4 | + |
| 5 | +SQL4Json is a Java library that enables SQL-like querying of JSON data. It parses SQL SELECT statements via an ANTLR4 grammar and applies them to JSON objects — supporting filtering, aggregation, sorting, and nested queries without a database. |
| 6 | + |
| 7 | +- **Language:** Java 8+ (source/target 1.8) |
| 8 | +- **Build:** Maven |
| 9 | +- **Version:** 0.0.2 |
| 10 | +- **License:** Apache 2.0 |
| 11 | +- **Package:** `io.github.mnesimiyilmaz.sql4json` |
| 12 | + |
| 13 | +## Build & Test Commands |
| 14 | + |
| 15 | +```bash |
| 16 | +# Compile |
| 17 | +mvn clean compile |
| 18 | + |
| 19 | +# Run tests |
| 20 | +mvn clean test |
| 21 | + |
| 22 | +# Package (creates jar-with-dependencies) |
| 23 | +mvn clean package |
| 24 | + |
| 25 | +# Full build skipping tests |
| 26 | +mvn clean package -DskipTests |
| 27 | +``` |
| 28 | + |
| 29 | +Tests use JUnit 5 (Jupiter). The single test class `SQL4JsonQueryTests` contains ~19 tests covering SELECT, WHERE, GROUP BY, HAVING, ORDER BY, functions, and nested queries. |
| 30 | + |
| 31 | +## Project Structure |
| 32 | + |
| 33 | +``` |
| 34 | +src/ |
| 35 | +├── main/ |
| 36 | +│ ├── antlr4/.../generated/ |
| 37 | +│ │ └── SQL4Json.g4 # ANTLR grammar (do NOT hand-edit generated code) |
| 38 | +│ └── java/.../sql4json/ |
| 39 | +│ ├── SQL4JsonProcessor.java # Main entry point - orchestrates query execution |
| 40 | +│ ├── SQL4JsonInput.java # Input wrapper (fromObject, fromJsonString, fromJsonNodeSupplier) |
| 41 | +│ ├── SQL4JsonListenerImpl.java # ANTLR parse tree listener |
| 42 | +│ ├── condition/ # WHERE/HAVING condition AST |
| 43 | +│ │ ├── CriteriaNode.java # Interface for condition evaluation |
| 44 | +│ │ ├── ComparisonNode.java # Leaf node - single comparison |
| 45 | +│ │ ├── AndNode.java # AND logical operator |
| 46 | +│ │ ├── OrNode.java # OR logical operator |
| 47 | +│ │ └── ConditionProcessor.java # Shunting Yard algorithm - infix to AST |
| 48 | +│ ├── definitions/ # Column and aggregation definitions |
| 49 | +│ │ ├── SelectColumnDefinition.java |
| 50 | +│ │ ├── JsonColumnWithAggFunctionDefinion.java # Note: "Definion" typo is intentional |
| 51 | +│ │ ├── JsonColumnWithNonAggFunctionDefinion.java |
| 52 | +│ │ └── OrderByColumnDefinion.java |
| 53 | +│ ├── processor/ # SQL query pipeline |
| 54 | +│ │ ├── SQLProcessor.java # Chains SQLBuilders for nested queries |
| 55 | +│ │ ├── SQLBuilder.java # Holds parsed clauses per query level |
| 56 | +│ │ └── SQLConstruct.java # Applies query to flattened data |
| 57 | +│ ├── grouping/ # GROUP BY processing |
| 58 | +│ │ ├── GroupByProcessor.java |
| 59 | +│ │ ├── GroupByInput.java |
| 60 | +│ │ └── GroupRowData.java |
| 61 | +│ ├── sorting/ |
| 62 | +│ │ └── SortProcessor.java # ORDER BY comparator builder |
| 63 | +│ └── utils/ # Utilities |
| 64 | +│ ├── JsonUtils.java # JSON flatten/unflatten (core utility, ~250 lines) |
| 65 | +│ ├── FieldKey.java # Flattened field key with family tracking |
| 66 | +│ ├── AggregateFunction.java |
| 67 | +│ ├── AggregationUtils.java |
| 68 | +│ ├── ComparisonOperator.java |
| 69 | +│ ├── ComparisonUtils.java |
| 70 | +│ ├── ValueUtils.java |
| 71 | +│ ├── ParameterizedFunctionUtils.java |
| 72 | +│ ├── ValueFunctionUtils.java |
| 73 | +│ ├── AntlrSyntaxErrorListener.java |
| 74 | +│ └── MurmurHash3.java |
| 75 | +└── test/ |
| 76 | + └── java/.../sql4json/ |
| 77 | + ├── SQL4JsonQueryTests.java # Main test suite |
| 78 | + └── dataclasses/ # Test POJOs (Person, Account, LoginHistory, SomeLoginData) |
| 79 | +``` |
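
The tree above notes that `ConditionProcessor` uses the Shunting Yard algorithm to turn infix conditions into an AST. The following is a minimal sketch of that idea, not the library's code: `ShuntingYardSketch`, `Node`, and the string token format are hypothetical, and each operand token stands in for one already-evaluated comparison.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; class and method names are not from SQL4Json.
class ShuntingYardSketch {

    // Minimal condition AST: a node evaluates against named comparison results.
    interface Node {
        boolean test(Map<String, Boolean> facts);
    }

    // AND binds tighter than OR, as in SQL.
    private static int precedence(String op) {
        return "AND".equals(op) ? 2 : 1;
    }

    // Pops two operands and pushes the combined AND/OR node.
    private static void reduce(String op, Deque<Node> output) {
        Node right = output.pop();
        Node left = output.pop();
        if ("AND".equals(op)) {
            output.push(f -> left.test(f) && right.test(f));
        } else {
            output.push(f -> left.test(f) || right.test(f));
        }
    }

    // Converts infix tokens (operands, AND, OR, parentheses) into one AST root.
    static Node parse(List<String> tokens) {
        Deque<Node> output = new ArrayDeque<>();
        Deque<String> operators = new ArrayDeque<>();
        for (String token : tokens) {
            if ("(".equals(token)) {
                operators.push(token);
            } else if (")".equals(token)) {
                while (!"(".equals(operators.peek())) {
                    reduce(operators.pop(), output);
                }
                operators.pop(); // discard the "("
            } else if ("AND".equals(token) || "OR".equals(token)) {
                while (!operators.isEmpty() && !"(".equals(operators.peek())
                        && precedence(operators.peek()) >= precedence(token)) {
                    reduce(operators.pop(), output);
                }
                operators.push(token);
            } else {
                // Leaf: stands in for a single comparison such as "age > 30".
                output.push(f -> f.getOrDefault(token, false));
            }
        }
        while (!operators.isEmpty()) {
            reduce(operators.pop(), output);
        }
        return output.pop();
    }

    public static void main(String[] args) {
        Map<String, Boolean> facts = new HashMap<>();
        facts.put("a", true);
        facts.put("b", false);
        facts.put("c", false);
        // a OR b AND c is grouped as a OR (b AND c); prints true
        System.out.println(parse(Arrays.asList("a", "OR", "b", "AND", "c")).test(facts));
    }
}
```

The sketch assumes well-formed input; unbalanced parentheses surface as a `NoSuchElementException` from the operator stack.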
| 80 | + |
| 81 | +## Architecture & Data Flow |
| 82 | + |
| 83 | +``` |
| 84 | +Input JSON |
| 85 | + → SQL4JsonInput (wraps as Jackson JsonNode) |
| 86 | + → SQL4JsonProcessor (entry point) |
| 87 | + → SQLProcessor (handles nested queries via ">>>" splitting) |
| 88 | + → SQLBuilder (ANTLR parses SQL into clause objects) |
| 89 | + → SQLConstruct (executes query): |
| 90 | + 1. Flatten JSON to Map<FieldKey, Object> |
| 91 | + 2. WHERE filter (ConditionProcessor → CriteriaNode AST) |
| 92 | + 3. GROUP BY (GroupByProcessor with aggregation functions) |
| 93 | + 4. HAVING filter |
| 94 | + 5. ORDER BY (SortProcessor) |
| 95 | + 6. SELECT projection (column selection/aliasing) |
| 96 | + 7. Unflatten back to JsonNode |
| 97 | + → Returns ArrayNode result |
| 98 | +``` |
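
The clause order in steps 2 to 6 can be illustrated with plain Java 8 streams. This is a sketch under stated assumptions rather than how `SQLConstruct` is written: `PipelineOrderSketch`, `row`, and `run` are hypothetical names, rows are flat `Map<String, Object>` values with `String` keys instead of `FieldKey`, and the query is hard-coded.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration of the clause order; not SQL4Json's implementation.
class PipelineOrderSketch {

    // Builds one flat row; keys play the role of flattened field paths.
    static Map<String, Object> row(String city, int age) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("city", city);
        r.put("age", age);
        return r;
    }

    // Roughly: SELECT city, COUNT(*) AS cnt ... WHERE age >= 18
    //          GROUP BY city HAVING cnt >= 2 ORDER BY cnt DESC
    static List<Map<String, Object>> run(List<Map<String, Object>> rows) {
        // WHERE filter, then GROUP BY with a COUNT aggregate
        Map<Object, Long> counts = rows.stream()
                .filter(r -> (Integer) r.get("age") >= 18)
                .collect(Collectors.groupingBy(r -> r.get("city"),
                        LinkedHashMap::new, Collectors.counting()));

        return counts.entrySet().stream()
                // HAVING filter on the aggregated value
                .filter(e -> e.getValue() >= 2)
                // ORDER BY cnt DESC
                .sorted(Map.Entry.<Object, Long>comparingByValue().reversed())
                // SELECT projection with an alias
                .map(e -> {
                    Map<String, Object> out = new LinkedHashMap<>();
                    out.put("city", e.getKey());
                    out.put("cnt", e.getValue());
                    return out;
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = Arrays.asList(
                row("Ankara", 30), row("Izmir", 25), row("Ankara", 17),
                row("Izmir", 40), row("Izmir", 19), row("Ankara", 22),
                row("Bursa", 50));
        // prints [{city=Izmir, cnt=3}, {city=Ankara, cnt=2}]
        System.out.println(run(rows));
    }
}
```

The row for Ankara with age 17 is dropped by the WHERE step before counting, and Bursa is dropped by HAVING after counting, which is why the two filters cannot be merged.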
| 99 | + |
| 100 | +## Key Dependencies |
| 101 | + |
| 102 | +| Dependency | Version | Purpose | |
| 103 | +|-----------|---------|---------| |
| 104 | +| ANTLR 4 | 4.9.3 | SQL grammar parsing (pinned for Java 8 compat) | |
| 105 | +| Jackson | 2.15.2 | JSON processing (databind, dataformat-xml, date/time modules) | |
| 106 | +| Lombok | 1.18.30 | Boilerplate reduction (@Getter, @Setter, @RequiredArgsConstructor) | |
| 107 | +| JUnit 5 | 5.10.0 | Testing | |
| 108 | + |
| 109 | +## Code Conventions |
| 110 | + |
| 111 | +- **Naming:** PascalCase classes, camelCase methods, UPPER_SNAKE_CASE constants |
| 112 | +- **Known typo:** `Definion` (not `Definition`) in class names — this is consistent throughout and should not be "fixed" |
| 113 | +- **Lombok:** Used extensively — `@Getter`, `@Setter`, `@RequiredArgsConstructor`, `@AllArgsConstructor` |
| 114 | +- **Functional style:** Heavy use of Java 8 streams, `BiPredicate`, `Function`, `Supplier`, `Optional` |
| 115 | +- **JSON flattening:** Nested JSON is flattened to `Map<FieldKey, Object>` for processing, then unflattened for output. `FieldKey` tracks the "family" (base path) for nested field grouping. |
| 116 | +- **ANTLR generated code:** Located in `target/generated-sources/antlr4/`. Never edit generated files — modify `SQL4Json.g4` instead. |
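
To make the JSON flattening convention above concrete, here is a minimal round-trip sketch. It is an assumption-laden stand-in, not `JsonUtils`: `FlattenSketch` is a hypothetical name, plain dotted `String` keys replace `FieldKey`, nested `Map` values replace Jackson nodes, and arrays and keys containing dots are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; SQL4Json's JsonUtils and FieldKey are not shown here.
class FlattenSketch {

    // Flattens nested maps into dotted keys, e.g. {a={b=1}} becomes {a.b=1}.
    static Map<String, Object> flatten(Map<String, Object> nested) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flattenInto("", nested, flat);
        return flat;
    }

    @SuppressWarnings("unchecked")
    private static void flattenInto(String prefix, Map<String, Object> node,
                                    Map<String, Object> flat) {
        for (Map.Entry<String, Object> e : node.entrySet()) {
            String key = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                // Recurse into nested objects, extending the path prefix.
                flattenInto(key, (Map<String, Object>) e.getValue(), flat);
            } else {
                flat.put(key, e.getValue());
            }
        }
    }

    // Rebuilds the nested structure from dotted keys.
    @SuppressWarnings("unchecked")
    static Map<String, Object> unflatten(Map<String, Object> flat) {
        Map<String, Object> root = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : flat.entrySet()) {
            String[] parts = e.getKey().split("\\.");
            Map<String, Object> current = root;
            // Walk (or create) the intermediate objects for all but the last segment.
            for (int i = 0; i < parts.length - 1; i++) {
                current = (Map<String, Object>) current.computeIfAbsent(
                        parts[i], k -> new LinkedHashMap<String, Object>());
            }
            current.put(parts[parts.length - 1], e.getValue());
        }
        return root;
    }

    public static void main(String[] args) {
        Map<String, Object> geo = new LinkedHashMap<>();
        geo.put("lat", 38.4);
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Izmir");
        address.put("geo", geo);
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("name", "Ada");
        person.put("address", address);

        // prints {name=Ada, address.city=Izmir, address.geo.lat=38.4}
        System.out.println(flatten(person));
    }
}
```

In this sketch the shared prefix `address` plays a role loosely similar to the "family" described above: every key under it belongs to the same nested object.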
| 117 | + |
| 118 | +## SQL Syntax Supported |
| 119 | + |
| 120 | +- `SELECT *`, specific columns, aliases (`AS`), aggregate functions |
| 121 | +- `FROM $r` (root reference), nested paths (`$r.data.items`) |
| 122 | +- `WHERE` with `=`, `!=`, `<`, `>`, `<=`, `>=`, `LIKE`, `IS NULL`, `IS NOT NULL`, `AND`, `OR`, parentheses |
| 123 | +- `GROUP BY` with `HAVING` |
| 124 | +- `ORDER BY` with `ASC`/`DESC` |
| 125 | +- **Nested queries:** `>>>` operator or subquery in FROM clause |
| 126 | +- **Functions:** `LOWER()`, `UPPER()`, `COALESCE()`, `TO_DATE()`, `NOW()`, `COUNT()`, `SUM()`, `AVG()`, `MIN()`, `MAX()` |
| 127 | + |
| 128 | +## Development Notes |
| 129 | + |
| 130 | +- No CI/CD pipeline configured — run `mvn clean test` locally before committing |
| 131 | +- ANTLR plugin runs during `generate-sources` phase — grammar changes require a rebuild |
| 132 | +- The project targets Java 8 compatibility; avoid Java 9+ APIs such as `List.of` or `Optional.isEmpty` |
| 133 | +- Publishing is configured for OSSRH (snapshots) and GitHub Packages via Maven profiles (`ossrh-snapshot`, `github`, `release`) |