We have these lines:
Cover.jpg
01 City Ruins.flac
02 Amusement Park.flac
03 A Beautiful Song.flac
04 Alien Manifestation.flac
05 The Tower.flac
06 Dependent Weakling.flac
07 Bipolar Nightmare.flac
08 Mourning.flac
09 The Sound of the End.flac
10 Weight of the World.flac
We want this:
cover.jpg
city-ruins.flac
amusement-park.flac
a-beautiful-song.flac
alien-manifestation.flac
the-tower.flac
dependent-weakling.flac
bipolar-nightmare.flac
mourning.flac
the-sound-of-the-end.flac
weight-of-the-world.flac
How do we accomplish this?
A regular expression is a sequence of characters that represents a search pattern.
Let’s use regexr.com to try these out.
Sequences of letters are a search pattern for those letters. By default they’re case-sensitive.
Use [] to denote a set of characters; the set matches any one of them.
Use . to denote any single character (in most engines, anything except a line break).
th searches for th:
The quick brown fox jumps over the lazy dog.
o[vwx] searches for ov, ow, and ox:
The quick brown fox jumps over the lazy dog.
o. matches o and then any character, including whitespace:
The quick brown fox jumps over the lazy dog.
o[a-z] matches o and then a character a, b, c, … z:
boa lobby location of OJ opening soyoz o0 o1 o2 o5 o9 oH
o[1-8] matches o and then a digit from 1 to 8:
boa lobby location of OJ opening soyoz o0 o1 o2 o5 o9 oH
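The same searches can be tried outside regexr.com. The following is a minimal sketch using Python's re module (Python is an assumption here; the slides only use regexr.com), where re.findall returns every non-overlapping match in order:

```python
import re

sentence = "The quick brown fox jumps over the lazy dog."
words = "boa lobby location of OJ opening soyoz o0 o1 o2 o5 o9 oH"

# Plain letters match themselves, case-sensitively ("The" is skipped).
th = re.findall(r"th", sentence)

# A set matches exactly one of the listed characters.
o_set = re.findall(r"o[vwx]", sentence)

# A dot matches any single character after the o.
o_any = re.findall(r"o.", sentence)

# Ranges inside a set: lowercase letters, then digits 1 to 8.
o_lower = re.findall(r"o[a-z]", words)
o_digit = re.findall(r"o[1-8]", words)

print(th)       # ['th']
print(o_set)    # ['ow', 'ox', 'ov']
print(o_any)    # ['ow', 'ox', 'ov', 'og']
print(o_lower)  # ['oa', 'ob', 'oc', 'on', 'of', 'op', 'oy', 'oz']
print(o_digit)  # ['o1', 'o2', 'o5']
```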
Multipliers (also called quantifiers) act on the item to their left.
* matches an item zero or more times.
+ matches an item one or more times.
? matches an item zero or one times.
lo* matches l and then zero or more o:
Are you looking at the lock or the silk?
lo+ matches l then one or more o:
Are you looking at the lock or the silk?
lo? matches l and then an o or nothing:
Are you looking at the lock or the silk?
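To compare the three multipliers side by side, here is a small sketch in Python (an assumed choice of language) that runs each pattern over the same sentence:

```python
import re

text = "Are you looking at the lock or the silk?"

# * : l followed by zero or more o
star = re.findall(r"lo*", text)

# + : l followed by at least one o, so the l in "silk" no longer matches
plus = re.findall(r"lo+", text)

# ? : l followed by at most one o, so "loo" is cut down to "lo"
optional = re.findall(r"lo?", text)

print(star)      # ['loo', 'lo', 'l']
print(plus)      # ['loo', 'lo']
print(optional)  # ['lo', 'lo', 'l']
```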
Typically, the \ is used to escape metacharacters like ., * or ].
\\ escapes a \.
See the difference between n. and n\. below:
n. matches n and then any character:
An expression.
n\. matches n and then a period (.):
An expression.
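The difference is easy to check in code. This sketch assumes Python, where raw strings (r"...") keep the backslash intact so that it reaches the regex engine:

```python
import re

text = "An expression."

# Unescaped dot: n followed by any character (here a space, then a period).
unescaped = re.findall(r"n.", text)

# Escaped dot: n followed by a literal period only.
escaped = re.findall(r"n\.", text)

# \\ matches a single literal backslash.
backslashes = re.findall(r"\\", r"C:\temp\file")

print(unescaped)         # ['n ', 'n.']
print(escaped)           # ['n.']
print(len(backslashes))  # 2
```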
Most decent text editors and IDEs offer search and replace based on regular expressions.
Use () to “capture” text and reuse it in the replacement with $1, $2, etc.
L(.*?)(\s.) “captures” ook as $1 and the space plus the following o as $2:
Look over there!
We tell our editor/command to replace the captured text with something such as ABC$1123$2$2 and get:
ABCook123 o over there!
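The capture example can be reproduced in code. This is a sketch in Python (an assumption; editors and regexr.com use the $1 form), where backreferences in the replacement are written \g<1> and \g<2>. The \g<1> spelling is used because a bare \1 followed by the digits 123 would be ambiguous:

```python
import re

text = "Look over there!"
pattern = r"L(.*?)(\s.)"

# Inspect what each group captured.
match = re.search(pattern, text)
group1 = match.group(1)  # the lazy .*? stops as soon as \s. can match
group2 = match.group(2)  # a whitespace character plus one more character

# Equivalent of the replacement ABC$1123$2$2.
result = re.sub(pattern, r"ABC\g<1>123\g<2>\g<2>", text)

print(repr(group1))  # 'ook'
print(repr(group2))  # ' o'
print(result)        # ABCook123 o over there!
```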
We’ll write a search/replacement pair as s/<search>/<replacement>/
A couple of regular expressions and replacement expressions:
s/^\d{2}\s//
s/\s/-/
Regular expressions are immediately useful in many situations:
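As a rough sketch, the two substitutions above can be applied to the file names from the start with Python's re.sub (Python is an assumed choice). Note that re.sub replaces every match, not just the first. The lower() call is an addition: the two patterns alone leave the capital letters in place. The function name new_name is made up for illustration.

```python
import re

names = [
    "Cover.jpg",
    "01 City Ruins.flac",
    "09 The Sound of the End.flac",
    "10 Weight of the World.flac",
]

def new_name(name: str) -> str:
    """Hypothetical helper: strip the track number, hyphenate, lowercase."""
    name = re.sub(r"^\d{2}\s", "", name)  # s/^\d{2}\s//
    name = re.sub(r"\s", "-", name)       # s/\s/-/ applied to every match
    return name.lower()                   # extra step, not a regex

renamed = [new_name(n) for n in names]
for old, new in zip(names, renamed):
    print(f"{old} -> {new}")
```

To rename real files, each pair could be passed to os.rename, ideally after printing the planned changes first as done here.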
Integral to many systems:
Suppose we renamed the class User to Account or something. We can do a regex search and replace on the following code segment to make that change in another file:
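s/User([^A-Za-z0-9])/Account$1/ or s/User(?![A-Za-z0-9])/Account/
The first captures the character after User and puts it back with $1; the second uses a negative lookahead, which checks the next character without consuming it. Either way, longer names such as UserService and UserRepository are left alone.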
@Service
@Transactional
public class UserService {
    @Autowired
    private UserRepository userRepository;
    @Autowired
    private BCryptPasswordEncoder encoder;
    ...
    public List<User> getAll() {
        return userRepository.findAll();
    }
    public User getFromId(UUID id) {
        Optional<User> found = userRepository.findById(id);
        if (!found.isPresent()) {
            throw createUserNotFoundException(id);
        }
        return found.get();
    }
    ...
}
Try to write a regular expression that matches timestamps in the format HH:mm:ss.
Valid examples:
00:00:00
05:09:28
15:31:09
23:59:59
Invalid examples:
24:00:00
00:60:00
00:00:60
A correct response:
(?:(?:[01][0-9])|(?:2[0-3]))(?::[0-5][0-9]){2}
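A candidate answer can be checked against the valid and invalid examples with a few lines of code. The sketch below assumes Python and a pattern that limits the hour to 00 through 23 with 2[0-3]; re.fullmatch is used so that the whole string has to be a timestamp, not just part of it.

```python
import re

# Hours 00-23, then two ":mm"-style groups limited to 00-59.
timestamp = re.compile(r"(?:[01][0-9]|2[0-3])(?::[0-5][0-9]){2}")

valid = ["00:00:00", "05:09:28", "15:31:09", "23:59:59"]
invalid = ["24:00:00", "00:60:00", "00:00:60"]

all_valid_match = all(timestamp.fullmatch(t) for t in valid)
no_invalid_match = not any(timestamp.fullmatch(t) for t in invalid)

print(all_valid_match)   # True
print(no_invalid_match)  # True
```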
Regex Golf: alf.nu/RegexGolf
Regexone: regexone.com