Beyond basic concatenation and case conversion, MySQL offers functions for precision string manipulation — essential for data cleaning, ETL (Extract, Transform, Load) processes, and report formatting.
Advanced String Functions: Higher-order text manipulation tools used for finding positions, padding strings to fixed lengths, and extracting data from either end of a string.
| Function | Description | Example | Result |
|---|---|---|---|
REVERSE(str) |
Flips a string backward | REVERSE('SQL') |
'LQS' |
LOCATE(sub, str) |
Returns position of substring | LOCATE('@', 'a@b.com') |
2 |
LEFT(str, n) |
Returns first n characters |
LEFT('Database', 4) |
'Data' |
RIGHT(str, n) |
Returns last n characters |
RIGHT('Database', 4) |
'base' |
LPAD(str, n, pad) |
Pads left to length n |
LPAD('42', 5, '0') |
'00042' |
RPAD(str, n, pad) |
Pads right to length n |
RPAD('42', 5, '*') |
'42***' |
In enterprise systems, you often deal with:
- Fixed-width files: Legacy systems often require IDs to be exactly 10 digits (
0000001234). - Data Masking: Displaying only the last 4 digits of a credit card (
************4455). - Parsing: Finding a specific delimiter (like
@or,) to chop up a string.
When using LPAD or RPAD, if the original string is longer than the specified length n, MySQL will truncate the string to fit the length.
SELECT LPAD('Enterprise', 5, '*');
-- Result: 'Enter' (The string was truncated!)Professional developers always check that the target length covers the maximum possible length of the data to avoid accidental truncation.
Example 1: Credit Card Masking
-- Mask a 16-digit card, showing only the last 4 digits
SELECT
CONCAT(REPEAT('*', 12), RIGHT(Card_Number, 4)) AS Masked_Card
FROM Transactions;
-- Result: ************4455Example 2: Fixed-Width ID Generation
-- Ensuring every Invoice ID is exactly 10 characters for a bank file
SELECT LPAD(Invoice_ID, 10, '0') AS Bank_Padded_ID
FROM Invoices;
-- Result: '0000009842'-- 1. Find the domain from an email address
-- Using LOCATE to find @ and SUBSTRING to get everything after it
SELECT
SUBSTRING(Email, LOCATE('@', Email) + 1) AS Domain
FROM Users;
-- 2. Reverse a string (Useful for palindromes or obscure encoding tasks)
SELECT REVERSE('racecar'); -- 'racecar'
-- 3. Extracting file extension
-- (Finding the last '.' is harder in pure SQL, often uses REVERSE + LOCATE trick)
SELECT REVERSE(LEFT(REVERSE('manual.pdf'), LOCATE('.', REVERSE('manual.pdf')) - 1)) AS Extension;- Confusing
LEFT/RIGHTwithSUBSTRING:LEFT(str, 3)is just a shorthand forSUBSTRING(str, 1, 3). They are interchangeable, butLEFTis more readable for simple prefix extraction. - Locating non-existent substrings:
LOCATE('x', 'abc')returns 0 (not NULL). Always check for 0 results when usingLOCATEin logic.
Handling Multi-value columns (The "Anti-Pattern" Search):
Sometimes developers store comma-separated values in a single column (e.g., Tags = 'sql,mysql,db'). To find if 'mysql' is in that string, LOCATE is more efficient than LIKE:
SELECT * FROM Table WHERE LOCATE('mysql', Tags) > 0;Note: This is still an architectural anti-pattern! You should use a mapping table instead (Normalization - Phase 16).
- Task 1: Write a query that pads the
Emp_IDto exactly 8 digits using leading zeros. - Task 2: Use
LOCATEandSUBSTRINGto extract the username part (before the@) of an email address.