We have these lines:
Cover.jpg
01 City Ruins.flac
02 Amusement Park.flac
03 A Beautiful Song.flac
04 Alien Manifestation.flac
05 The Tower.flac
06 Dependent Weakling.flac
07 Bipolar Nightmare.flac
08 Mourning.flac
09 The Sound of the End.flac
10 Weight of the World.flac
We want this:
cover.jpg
city-ruins.flac
amusement-park.flac
a-beautiful-song.flac
alien-manifestation.flac
the-tower.flac
dependent-weakling.flac
bipolar-nightmare.flac
mourning.flac
the-sound-of-the-end.flac
weight-of-the-world.flac
How do we accomplish this?
A regular expression is a sequence of characters that represents a search pattern.
Let’s use regexr.com to try these out.
Sequences of letters are a search pattern for those letters. By default they’re case-sensitive.
Use [] to denote a set of characters.
Use . to denote any single character.
th searches for th:
The quick brown fox jumps over the lazy dog.
o[vwx] searches for ov, ow, and ox:
The quick brown fox jumps over the lazy dog.
o. matches o and then any character, including whitespace (but, by default, not a line break):
The quick brown fox jumps over the lazy dog.
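The three patterns above can be tried outside regexr.com as well. The following is a minimal sketch using Python's re module (an assumption; the slides themselves do not name a language), with the sample sentence spelled "brown":

```python
import re

text = "The quick brown fox jumps over the lazy dog."

# A literal sequence matches itself, case-sensitively:
# the capital "Th" in "The" is skipped, the "th" in "the" is found.
literal = re.findall(r"th", text)

# A set matches exactly one character out of those listed.
one_of = re.findall(r"o[vwx]", text)

# A dot matches any single character except a line break (by default).
any_char = re.findall(r"o.", text)

# The dot also accepts a space.
with_space = re.findall(r"o.", "go on")

print(literal)     # ['th']
print(one_of)      # ['ow', 'ox', 'ov']
print(any_char)    # ['ow', 'ox', 'ov', 'og']
print(with_space)  # ['o ', 'on']
```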
o[a-z] matches o and then one character from a, b, c, … z:
boa lobby location of OJ opening soyoz o0 o1 o2 o5 o9 oH
o[1-8] matches o and then a digit from 1 to 8:
boa lobby location of OJ opening soyoz o0 o1 o2 o5 o9 oH
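To see which parts of the sample line the two ranges pick out, here is a small sketch, again assuming Python's re module. Ranges are case-sensitive, so OJ and oH are not matched by o[a-z].

```python
import re

text = "boa lobby location of OJ opening soyoz o0 o1 o2 o5 o9 oH"

# o followed by one lowercase letter
lower = re.findall(r"o[a-z]", text)

# o followed by one digit between 1 and 8 (0 and 9 fall outside the range)
digits = re.findall(r"o[1-8]", text)

print(lower)   # ['oa', 'ob', 'oc', 'on', 'of', 'op', 'oy', 'oz']
print(digits)  # ['o1', 'o2', 'o5']
```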
Multipliers (usually called quantifiers) act on the item to their left.
* matches an item zero or more times.
+ matches an item one or more times.
? matches an item zero or one times.
lo* matches l and then zero or more o:
Are you looking at the lock or the silk?
lo+ matches l and then one or more o:
Are you looking at the lock or the silk?
lo? matches l and then an o or nothing:
Are you looking at the lock or the silk?
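The difference between the three is easiest to see side by side. A minimal sketch, assuming Python's re module:

```python
import re

text = "Are you looking at the lock or the silk?"

# l then zero or more o: also matches the bare l in "silk"
star = re.findall(r"lo*", text)

# l then one or more o: the l in "silk" no longer qualifies
plus = re.findall(r"lo+", text)

# l then at most one o: "looking" contributes only "lo"
optional = re.findall(r"lo?", text)

print(star)      # ['loo', 'lo', 'l']
print(plus)      # ['loo', 'lo']
print(optional)  # ['lo', 'lo', 'l']
```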
Typically, \ is used to escape metacharacters like ., *, or ].
\\ escapes a \.
See the difference between n. and n\. below:
n. matches n and then any character:
An expression.
n\. matches n and then a period (.):
An expression.
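The same comparison as a runnable sketch, assuming Python's re module. Raw strings (the r prefix) keep Python from interpreting the backslashes itself. The Windows-style path is only an illustrative string.

```python
import re

text = "An expression."

# Unescaped dot: n followed by any character (here a space, then a period)
unescaped = re.findall(r"n.", text)

# Escaped dot: n followed by a literal period only
escaped = re.findall(r"n\.", text)

# A doubled backslash in the pattern matches one literal backslash
backslashes = re.findall(r"\\", r"C:\temp")

# re.escape builds the escaped pattern from a literal string
auto_escaped = re.escape("n.")

print(unescaped)     # ['n ', 'n.']
print(escaped)       # ['n.']
print(backslashes)   # ['\\']  (one backslash, shown escaped by print)
print(auto_escaped)  # n\.
```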
Most decent text editors and IDEs offer search and replace based on regular expressions.
Use () to “capture” text and use it in the replacement with $1, $2, etc.
L(.*?)(\s.) matches Look o in the text below, capturing ook as $1 and the space plus o as $2:
Look over there!
We tell our editor/command to replace the match with something such as ABC$1123$2$2 and get:
ABCook123 o over there!
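A sketch of the same replacement in Python, which is an assumption about tooling: Python writes group references as \1 or \g<1> rather than $1. The \g<1> form is used here because a reference followed directly by the digits 123 would otherwise be ambiguous.

```python
import re

text = "Look over there!"
pattern = r"L(.*?)(\s.)"

# Inspect what the lazy group and the whitespace group capture
match = re.search(pattern, text)
whole, first, second = match.group(0), match.group(1), match.group(2)

# ABC + group 1 + literal 123 + group 2 twice
result = re.sub(pattern, r"ABC\g<1>123\2\2", text)

print(repr(whole))   # 'Look o'
print(repr(first))   # 'ook'
print(repr(second))  # ' o'
print(result)        # ABCook123 o over there!
```

Only the matched span (Look o) is replaced; the rest of the line (ver there!) is left untouched.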
We’ll write a search/replacement pair as s/<search>/<replacement>/. A trailing g means “replace every match”, not just the first.
A couple of search and replacement expressions for the track list at the start:
s/^\d{2}\s//
s/\s/-/g
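Applied to the file names from the start, the two substitutions plus a lowercasing step give the target names. This is a minimal sketch in Python; tidy is a hypothetical helper name, and lowercasing is done with str.lower() because the two expressions above do not cover it. re.sub replaces every match by default, which corresponds to a global replace.

```python
import re

names = [
    "Cover.jpg",
    "01 City Ruins.flac",
    "03 A Beautiful Song.flac",
    "09 The Sound of the End.flac",
]

def tidy(name: str) -> str:
    """Hypothetical helper: strip the track number, hyphenate, lowercase."""
    name = re.sub(r"^\d{2}\s", "", name)  # drop a leading two-digit number and the space after it
    name = re.sub(r"\s", "-", name)       # turn every whitespace character into a hyphen
    return name.lower()

tidied = [tidy(n) for n in names]
for old, new in zip(names, tidied):
    print(f"{old!r} -> {new!r}")
```

Actually renaming files on disk (for example with os.rename) is left out on purpose; printing the pairs first makes it easy to check the result before touching anything.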
Regular expressions are immediately useful in many situations:
Integral to many systems:
Suppose we renamed the class User to Account. A regex search and replace can carry that change into other files that use the class, such as the code segment below:
s/User([^A-Za-z0-9])/Account$1/ or s/User(?![A-Za-z0-9])/Account/
@Service
@Transactional
public class UserService {
    @Autowired
    private UserRepository userRepository;
    @Autowired
    private BCryptPasswordEncoder encoder;
    ...
    public List<User> getAll() {
        return userRepository.findAll();
    }
    public User getFromId(UUID id) {
        Optional<User> found = userRepository.findById(id);
        if (!found.isPresent()) {
            throw createUserNotFoundException(id);
        }
        return found.get();
    }
    ...
}
```
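To check what the rename touches and what it leaves alone, here is a sketch that runs both expressions over a few lines taken from the segment above. It assumes Python's re module, where the back-reference is written \1 instead of $1.

```python
import re

code = """public class UserService {
    private UserRepository userRepository;
    public List<User> getAll() {
    public User getFromId(UUID id) {
        Optional<User> found = userRepository.findById(id);
            throw createUserNotFoundException(id);
"""

# Lookahead form: User not followed by a letter or digit; nothing extra is consumed
with_lookahead = re.sub(r"User(?![A-Za-z0-9])", "Account", code)

# Capture form: the following character is consumed, so it is put back with \1
with_capture = re.sub(r"User([^A-Za-z0-9])", r"Account\1", code)

print(with_lookahead)
```

Only the standalone type name changes (List<Account>, public Account, Optional<Account>); UserService, UserRepository, and createUserNotFoundException stay as they are. Neither expression checks the character before User, so a name such as SuperUser would also be rewritten; adding a word boundary (\b) in front is one possible guard.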
Try to write a regular expression that matches timestamps in the format HH:mm:ss.
Valid examples:
00:00:00
05:09:28
15:31:09
23:59:59
Invalid examples:
24:00:00
00:60:00
00:00:60
A correct response:
(?:(?:[01][0-9])|(?:2[0-3]))(?::[0-5][0-9]){2}
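A quick way to check an answer against the valid and invalid examples is sketched below, assuming Python's re module; is_timestamp is a hypothetical helper name. Hours run from 00 to 23, so the hour part for the twenties is 2[0-3]. fullmatch is used so that the whole string has to be a timestamp, not just part of it.

```python
import re

# Hours 00-19 or 20-23, then two groups of ":" plus 00-59
TIMESTAMP = re.compile(r"(?:[01][0-9]|2[0-3])(?::[0-5][0-9]){2}")

def is_timestamp(value: str) -> bool:
    """Hypothetical helper: True if the whole string is a valid HH:mm:ss timestamp."""
    return TIMESTAMP.fullmatch(value) is not None

valid = ["00:00:00", "05:09:28", "15:31:09", "23:59:59"]
invalid = ["24:00:00", "00:60:00", "00:00:60"]

valid_results = [is_timestamp(v) for v in valid]
invalid_results = [is_timestamp(v) for v in invalid]

print(valid_results)    # [True, True, True, True]
print(invalid_results)  # [False, False, False]
```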
Regex Golf: alf.nu/RegexGolf
Regexone: regexone.com