11 September 2012
The Correct Cucumber Regex for Matching Text in Quotes
What’s the correct regular expression to use for matching text within quotes
of a Cucumber step? I argue that it’s not the one suggested by Cucumber
itself.
If I run a scenario with an undefined step:
When I click "Edit"
Cucumber will suggest:
When /^I click "(.*?)"$/ do |arg1|
pending # express the regexp above with the code you wish you had
end
The practice for most is to copy that suggestion into their step definition
file, and implement the step.
I’ve often run into trouble when I needed a slightly more specific version of
that step in a later test:
When I click "Edit" for "Jack"
Running the step, I’ll get:
no link or button Edit" for "Jack found (Capybara::ElementNotFound)
The regex defined in our initial step unintentionally matches this step too.
It captures everything between the first quote character and the last quote
character in the string. In this case: Edit" for "Jack
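The capture can be checked in plain Ruby, outside of Cucumber. This is a minimal sketch; the variable names are mine and are not part of any Cucumber API.

```ruby
# The pattern Cucumber suggests for: When I click "Edit"
suggested = /^I click "(.*?)"$/

# The more specific step, without its leading keyword
step = 'I click "Edit" for "Jack"'

match = suggested.match(step)

# The lazy .*? must still reach the closing quote followed by end of line,
# and . is allowed to match quote characters, so the capture keeps growing
# until the final quote.
captured = match[1]
puts captured
```

The printed value is the same string that shows up in the Capybara error above.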
As proof, attempting to implement the new step using Cucumber’s recommended
format yields:
Ambiguous match of "I click "Edit" for "Jack"":
features/step_definitions/running_stuff_steps.rb:12:in `/^I click "(.*?)"$/
features/step_definitions/running_stuff_steps.rb:16:in `/^I click "(.*?)" for "(.*?)"$/
Both regexes are valid matches for that step.
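So we need a regular expression that uniquely matches each step. The problem
regex effectively says “match everything between the first and last quote
characters”: the dot can match a quote character, and the closing "$ forces
the lazy .*? to keep expanding until it reaches the quote at the end of the
line. We need it to say “match text between a quote and the next quote,
regardless of where the end of the line is”. Or, in regular expression
language, not this: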
/^I click "(.*?)"$/
This:
/^I click "([^"]*)"$/
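To see that the negated character class keeps the two steps apart, here is a small plain-Ruby sketch. The two-argument pattern is my assumption of how the more specific step would be written in the same style; the variable names are illustrative only.

```ruby
# Patterns using [^"]* so a capture can never run past a quote character
one_arg  = /^I click "([^"]*)"$/
two_args = /^I click "([^"]*)" for "([^"]*)"$/  # assumed form of the specific step

simple   = 'I click "Edit"'
specific = 'I click "Edit" for "Jack"'

# Each pattern matches only the step it was written for
one_on_simple   = one_arg.match(simple)
one_on_specific = one_arg.match(specific)   # nil: the capture stops at the quote after Edit
two_on_simple   = two_args.match(simple)    # nil: there is no for "..." part
two_on_specific = two_args.match(specific)

puts one_on_simple[1]
puts two_on_specific.captures.inspect
puts one_on_specific.inspect
puts two_on_simple.inspect
```

With these patterns, only one step definition can match each step, so Cucumber has nothing ambiguous to report.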
This isn’t rocket science, but I’ve witnessed a few smart people scratching
their heads on this one, not wanting to spend the mental energy to switch from
whatever feature they are working on to dissect regular expressions.
Is there any good reason to default to the former over the latter? Is a pull
request warranted?