
LLM rules for PostgreSQL
Postgres is incredibly versatile. From storing vectors to power your LLMs, to using it as a task queue. Simple, versatile, and extensible.
This comes at a cost. As you add more complexity and start growing your team, it becomes harder to manage your database.
Right now, LLMs help with generating SQL and code for popular LLMs. However, it’s still difficult to catch Postgres performance issues.
Linter rules are hard to write so most of the review comes from a team member that holds all the tribal knowledge about the database.
This is where LLMs come in. They are:
- Customizable. Can check for anything you ask it to.
- Quick to implement. Usually one well-written text prompt will suffice.
LLMs shine where linter rules can’t be written. This makes them a great use case for managing Postgres.
Should I add rules in code review or code generation?
Rules for code generation and code review are similar, but not the same.
In my experience, the best place for LLM rules for Postgres is to have them run in code review. From my experience building the wispbit code reviewer, you have more control over how the rules get enforced.
That being said, the following rules are still a good foundation for both use cases. They are especially effective when working across teams and mature codebases:
Preventing table locks by enforcing concurrent indexes
Table locks cause downtime and affect your bottom line. Only add concurrent indexes:
When creating indexes in PostgreSQL, always use the `CONCURRENTLY` option to prevent blocking writes during index creation.
Bad:
```sql
CREATE INDEX idx_users_email ON users(email);
```
Good:
```sql
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
```
Preventing downtime from missing columns
In most setups, removing a column causes brief downtime because your ORM uses that column before the code is deployed.
By ensuring the column is ignored, the brief moment of downtime is prevented.
The following is an example using the Prisma ORM but you can apply this to any ORM that supports ignoring columns:
When creating a migration to remove a column from the database, ensure that the schema has the `@ignore` attribute on that column in the Prisma schema. Search for the .prisma schema in the codebase to verify this.
```sql
ALTER TABLE "User" DROP COLUMN "createdAt";
ALTER TABLE "User" DROP COLUMN "updatedAt";
```
```prisma
model User {
id Int @id @default(autoincrement())
email String @unique
password String
remarks String?
createdAt DateTime @default(now()) @ignore
updatedAt DateTime @updatedAt @ignore
}
```
Ensure critical indexes always exist
By default, foreign key constraints do not add indexes. With large tables, this can be problematic. This rule enforces adding an index for each foreign key.
When adding a foreign key constraint in PostgreSQL, always add a corresponding index.
Bad:
```sql
ALTER TABLE orders
ADD CONSTRAINT fk_orders_user_id
FOREIGN KEY (user_id) REFERENCES users(id);
```
Good:
```sql
ALTER TABLE orders
ADD CONSTRAINT fk_orders_user_id
FOREIGN KEY (user_id) REFERENCES users(id);
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders (user_id);
```
Ensure table column consistency
Ensuring tables all have similar schemas makes it easier for engineers to jump in and contribute to your code.
All CREATE TABLE statements must include `id`, `created_at`, and `updated_at` columns.
Bad:
```sql
CREATE TABLE users (
email VARCHAR(255) NOT NULL UNIQUE,
name VARCHAR(100) NOT NULL
);
```
Good:
```sql
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) NOT NULL UNIQUE,
name VARCHAR(100) NOT NULL,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() NOT NULL,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() NOT NULL
);
```
Ensure indexes are not already covered
A composite index like email_username_idx
covers both email
and email, username
query patterns. It ensures that an index is not made for email
.
This rule is specific to SQLAlchemy but you can ask an LLM to adapt it to your favorite ORM. wispbit lets you adapt rules when you add them.
Ensure that individual column indexes in SQLAlchemy are not covered by existing composite indexes.
Bad:
```python
class User(Base):
__tablename__ = "users"
__table_args__ = (
Index("email_username_idx", "email", "username"),
)
id: Mapped[int] = mapped_column(primary_key=True)
email: Mapped[str] = mapped_column(String(255), index=True) # covered by email_username_idx
username: Mapped[str] = mapped_column(String(100))
status: Mapped[str] = mapped_column(String(20))
```
Good:
```python
class User(Base):
__tablename__ = "users"
# Only composite index needed - covers both email and email+username queries
__table_args__ = (
Index("email_username_idx", "email", "username"),
)
id: Mapped[int] = mapped_column(primary_key=True)
email: Mapped[str] = mapped_column(String(255))
username: Mapped[str] = mapped_column(String(100))
status: Mapped[str] = mapped_column(String(20))
```
Ensure queries are covered by indexes
If you write a query, it should be covered by an index. This rule makes sure that each query has a corresponding index.
Again, this is written in SQLAlchemy but you can ask an LLM to adapt it to your ORM or the wispbit interface.
Ensure that SQLAlchemy query patterns are covered by appropriate database indexes.
Bad:
```python
# queries.py
def get_user_by_email(session: Session, email: str):
# Will perform full table scan - no index on email
return session.scalar(select(User).where(User.email == email))
# models.py
class User(Base):
__tablename__ = 'users'
id: Mapped[int] = mapped_column(primary_key=True)
email: Mapped[str] = mapped_column(String(255)) # No index
status: Mapped[str] = mapped_column(String(50))
created_at: Mapped[datetime] = mapped_column(DateTime)
```
Good:
```python
# queries.py
def get_user_by_email(session: Session, email: str):
# Is covered by an index
return session.scalar(select(User).where(User.email == email))
# models.py
class User(Base):
__tablename__ = 'users'
id: Mapped[int] = mapped_column(primary_key=True)
email: Mapped[str] = mapped_column(String(255), index=True)
status: Mapped[str] = mapped_column(String(50))
created_at: Mapped[datetime] = mapped_column(DateTime)
```
You can find more rules on the wispbit rules page.
Coming up with your own rules
Think of the last time you:
- Had downtime as a result of a code change
- Commented on a pull request
Turn that into a rule your team can agree on.
Postgres is incredibly versatile and un-opinionated. That’s what makes it so great. But when you start bringing other engineers into the mix - enforcing a set of standards becomes mandatory.