Chemical Formula Parser

As an exercise, I wrote a simple context free parser modeled from a pushdown automaton for chemical formulas such as NO2 or Fe(NO3)3.

To explain a bit about the decisions made:

The state functions (pstart, pchem and pnum) are lambdas because they carry around a lot of context. They could be methods of a class, it doesn't really matter.

The state functions have the input character injected because I realized halfway that I have to process some more after the line ended and I don't want to duplicate the state functions.

I am mostly concerned about the general messiness of the parser. There is no obvious way to know where variables are mutated, control flow is all over the place and consequently I find it difficult to reason about the program. Is there a better code structure?

Finally, I'd like to know if I caught all illegal syntax. Note repeating chemicals like NN is allowed.

#include<iostream>

#include<stack>

#include<unordered_map>

#include<string>

#include<iterator>



using namespace std::literals;



enum class state_t { start, num, chem, error };



double parse(std::string line, const std::unordered_map<std::string, double>& m)

{

    auto b = line.begin(), e = line.end();

    std::stack<double> stk;

    int num = 0;

    std::string chem;

    state_t state = state_t::start;



    auto pstart = [&](char c) {

        if(std::isupper(c))

        {

            chem = ""s + c;

            state = state_t::chem;

            return true;

        }

        else if(std::isdigit(c))

        {

            if(stk.empty())

                state = state_t::error;

            else

            {

                num = c - '0';

                state = state_t::num;

                return true;

            }

        }

        else if(c == '(')

        {

            stk.push(-1);

            return true;

        }

        else if(c == ')')

        {

            double sum = 0;

            while(!stk.empty() && stk.top() > 0)

            {

                sum += stk.top();

                stk.pop();

            }

            if(stk.empty())

                state = state_t::error;

            else

            {

                stk.pop();

                stk.push(sum);

                return true;

            }

        }

        else

            state = state_t::error;



        return false;

    };



    auto pnum = [&](char c) {

        if(std::isdigit(c))

        {

            num = 10 * num + c - '0';

            return true;

        }

        else

        {

            stk.top() *= num;

            state = state_t::start;

        }

        return false;

    };



    auto pchem = [&](char c){

        if(std::islower(c))

        {

            chem += c;

            return true;

        }

        else

        {

            if(auto it = m.find(chem); it != m.end())

            {

                stk.push(it->second);

                state = state_t::start;

            }

            else

                state = state_t::error;

        }

        return false;

    };



    while(b != e)

    {

        switch(state)

        {

        case state_t::start:

            if(pstart(*b))

                b++;

            break;

        case state_t::num:

            if(pnum(*b))

                b++;

            break;

        case state_t::chem:

            if(pchem(*b))

                b++;

            break;

        default:

            return -1;

        }

    }



    switch(state)

    {

    case state_t::num:

        pnum('n');

        break;

    case state_t::chem:

        pchem('n');

        break;

    }



    if(state == state_t::error)

        return -1;



    double sum = 0;

    while(!stk.empty() && stk.top() > 0)

    {

        sum += stk.top();

        stk.pop();

    }

    if(stk.size() > 0)  // expected ')'

        return -1;



    return sum;

}



int main()

{

    std::unordered_map<std::string, double> m;

    m["Na"] = 23.5;

    m["Cl"] = 35.5;

    m["O"] = 16;

    m["N"] = 14;

    m["Fe"] = 55.8;



    std::string line;

    while(getline(std::cin, line))

        std::cout << parse(std::move(line), m) << 'n';

}

edited Jan 27 at 14:38

asked Jan 27 at 10:26

友人A

436

1

$begingroup$
I realized if(stk.size() > 1) should be if(stk.size() > 0), should I edit?
$endgroup$
– 友人A
Jan 27 at 11:08

2

$begingroup$
Feel free to edit as long as there are no answers. But try to not edit as time goes by.
$endgroup$
– Incomputable
Jan 27 at 11:26

2

$begingroup$
Note, the singular is “automaton”.
$endgroup$
– Konrad Rudolph
Jan 27 at 14:26

1

$begingroup$
There is a standard for writing chemical formulas to be machine parsable so that they are not ambiguous. Its called SMILE Simple Molecule Input Line Entry. The trouble with your system is that for more complex molecules you can't be unambiguous.
$endgroup$
– Martin York
Jan 27 at 18:50

1

$begingroup$
You said that you wrote a parser but the result of the function is double so clearly you’re doing way more than just parsing, and it’s not at all clear (from the names, nor the usage) what transformation you’re performing. I’m guessing you are calculating the molecular weight? In which case the function parse should be renamed appropriately.
$endgroup$
– Konrad Rudolph
Jan 28 at 11:32

|
show 2 more comments

As an exercise, I wrote a simple context free parser modeled from a pushdown automaton for chemical formulas such as NO2 or Fe(NO3)3.

To explain a bit about the decisions made:

The state functions (pstart, pchem and pnum) are lambdas because they carry around a lot of context. They could be methods of a class, it doesn't really matter.

The state functions have the input character injected because I realized halfway that I have to process some more after the line ended and I don't want to duplicate the state functions.

Finally, I'd like to know if I caught all illegal syntax. Note repeating chemicals like NN is allowed.

#include<iostream>

#include<stack>

#include<unordered_map>

#include<string>

#include<iterator>



using namespace std::literals;



enum class state_t { start, num, chem, error };



double parse(std::string line, const std::unordered_map<std::string, double>& m)

{

    auto b = line.begin(), e = line.end();

    std::stack<double> stk;

    int num = 0;

    std::string chem;

    state_t state = state_t::start;



    auto pstart = [&](char c) {

        if(std::isupper(c))

        {

            chem = ""s + c;

            state = state_t::chem;

            return true;

        }

        else if(std::isdigit(c))

        {

            if(stk.empty())

                state = state_t::error;

            else

            {

                num = c - '0';

                state = state_t::num;

                return true;

            }

        }

        else if(c == '(')

        {

            stk.push(-1);

            return true;

        }

        else if(c == ')')

        {

            double sum = 0;

            while(!stk.empty() && stk.top() > 0)

            {

                sum += stk.top();

                stk.pop();

            }

            if(stk.empty())

                state = state_t::error;

            else

            {

                stk.pop();

                stk.push(sum);

                return true;

            }

        }

        else

            state = state_t::error;



        return false;

    };



    auto pnum = [&](char c) {

        if(std::isdigit(c))

        {

            num = 10 * num + c - '0';

            return true;

        }

        else

        {

            stk.top() *= num;

            state = state_t::start;

        }

        return false;

    };



    auto pchem = [&](char c){

        if(std::islower(c))

        {

            chem += c;

            return true;

        }

        else

        {

            if(auto it = m.find(chem); it != m.end())

            {

                stk.push(it->second);

                state = state_t::start;

            }

            else

                state = state_t::error;

        }

        return false;

    };



    while(b != e)

    {

        switch(state)

        {

        case state_t::start:

            if(pstart(*b))

                b++;

            break;

        case state_t::num:

            if(pnum(*b))

                b++;

            break;

        case state_t::chem:

            if(pchem(*b))

                b++;

            break;

        default:

            return -1;

        }

    }



    switch(state)

    {

    case state_t::num:

        pnum('n');

        break;

    case state_t::chem:

        pchem('n');

        break;

    }



    if(state == state_t::error)

        return -1;



    double sum = 0;

    while(!stk.empty() && stk.top() > 0)

    {

        sum += stk.top();

        stk.pop();

    }

    if(stk.size() > 0)  // expected ')'

        return -1;



    return sum;

}



int main()

{

    std::unordered_map<std::string, double> m;

    m["Na"] = 23.5;

    m["Cl"] = 35.5;

    m["O"] = 16;

    m["N"] = 14;

    m["Fe"] = 55.8;



    std::string line;

    while(getline(std::cin, line))

        std::cout << parse(std::move(line), m) << 'n';

}

edited Jan 27 at 14:38

asked Jan 27 at 10:26

友人A

436

1

$begingroup$
I realized if(stk.size() > 1) should be if(stk.size() > 0), should I edit?
$endgroup$
– 友人A
Jan 27 at 11:08

2

$begingroup$
Feel free to edit as long as there are no answers. But try to not edit as time goes by.
$endgroup$
– Incomputable
Jan 27 at 11:26

2

$begingroup$
Note, the singular is “automaton”.
$endgroup$
– Konrad Rudolph
Jan 27 at 14:26

1

$begingroup$
There is a standard for writing chemical formulas to be machine parsable so that they are not ambiguous. Its called SMILE Simple Molecule Input Line Entry. The trouble with your system is that for more complex molecules you can't be unambiguous.
$endgroup$
– Martin York
Jan 27 at 18:50

1

$begingroup$
You said that you wrote a parser but the result of the function is double so clearly you’re doing way more than just parsing, and it’s not at all clear (from the names, nor the usage) what transformation you’re performing. I’m guessing you are calculating the molecular weight? In which case the function parse should be renamed appropriately.
$endgroup$
– Konrad Rudolph
Jan 28 at 11:32

|
show 2 more comments

As an exercise, I wrote a simple context free parser modeled from a pushdown automaton for chemical formulas such as NO2 or Fe(NO3)3.

To explain a bit about the decisions made:

The state functions (pstart, pchem and pnum) are lambdas because they carry around a lot of context. They could be methods of a class, it doesn't really matter.

The state functions have the input character injected because I realized halfway that I have to process some more after the line ended and I don't want to duplicate the state functions.

Finally, I'd like to know if I caught all illegal syntax. Note repeating chemicals like NN is allowed.

#include<iostream>

#include<stack>

#include<unordered_map>

#include<string>

#include<iterator>



using namespace std::literals;



enum class state_t { start, num, chem, error };



double parse(std::string line, const std::unordered_map<std::string, double>& m)

{

    auto b = line.begin(), e = line.end();

    std::stack<double> stk;

    int num = 0;

    std::string chem;

    state_t state = state_t::start;



    auto pstart = [&](char c) {

        if(std::isupper(c))

        {

            chem = ""s + c;

            state = state_t::chem;

            return true;

        }

        else if(std::isdigit(c))

        {

            if(stk.empty())

                state = state_t::error;

            else

            {

                num = c - '0';

                state = state_t::num;

                return true;

            }

        }

        else if(c == '(')

        {

            stk.push(-1);

            return true;

        }

        else if(c == ')')

        {

            double sum = 0;

            while(!stk.empty() && stk.top() > 0)

            {

                sum += stk.top();

                stk.pop();

            }

            if(stk.empty())

                state = state_t::error;

            else

            {

                stk.pop();

                stk.push(sum);

                return true;

            }

        }

        else

            state = state_t::error;



        return false;

    };



    auto pnum = [&](char c) {

        if(std::isdigit(c))

        {

            num = 10 * num + c - '0';

            return true;

        }

        else

        {

            stk.top() *= num;

            state = state_t::start;

        }

        return false;

    };



    auto pchem = [&](char c){

        if(std::islower(c))

        {

            chem += c;

            return true;

        }

        else

        {

            if(auto it = m.find(chem); it != m.end())

            {

                stk.push(it->second);

                state = state_t::start;

            }

            else

                state = state_t::error;

        }

        return false;

    };



    while(b != e)

    {

        switch(state)

        {

        case state_t::start:

            if(pstart(*b))

                b++;

            break;

        case state_t::num:

            if(pnum(*b))

                b++;

            break;

        case state_t::chem:

            if(pchem(*b))

                b++;

            break;

        default:

            return -1;

        }

    }



    switch(state)

    {

    case state_t::num:

        pnum('n');

        break;

    case state_t::chem:

        pchem('n');

        break;

    }



    if(state == state_t::error)

        return -1;



    double sum = 0;

    while(!stk.empty() && stk.top() > 0)

    {

        sum += stk.top();

        stk.pop();

    }

    if(stk.size() > 0)  // expected ')'

        return -1;



    return sum;

}



int main()

{

    std::unordered_map<std::string, double> m;

    m["Na"] = 23.5;

    m["Cl"] = 35.5;

    m["O"] = 16;

    m["N"] = 14;

    m["Fe"] = 55.8;



    std::string line;

    while(getline(std::cin, line))

        std::cout << parse(std::move(line), m) << 'n';

}

edited Jan 27 at 14:38

asked Jan 27 at 10:26

友人A

436

As an exercise, I wrote a simple context free parser modeled from a pushdown automaton for chemical formulas such as NO2 or Fe(NO3)3.

To explain a bit about the decisions made:

The state functions (pstart, pchem and pnum) are lambdas because they carry around a lot of context. They could be methods of a class, it doesn't really matter.

The state functions have the input character injected because I realized halfway that I have to process some more after the line ended and I don't want to duplicate the state functions.

Finally, I'd like to know if I caught all illegal syntax. Note repeating chemicals like NN is allowed.

#include<iostream>

#include<stack>

#include<unordered_map>

#include<string>

#include<iterator>



using namespace std::literals;



enum class state_t { start, num, chem, error };



double parse(std::string line, const std::unordered_map<std::string, double>& m)

{

    auto b = line.begin(), e = line.end();

    std::stack<double> stk;

    int num = 0;

    std::string chem;

    state_t state = state_t::start;



    auto pstart = [&](char c) {

        if(std::isupper(c))

        {

            chem = ""s + c;

            state = state_t::chem;

            return true;

        }

        else if(std::isdigit(c))

        {

            if(stk.empty())

                state = state_t::error;

            else

            {

                num = c - '0';

                state = state_t::num;

                return true;

            }

        }

        else if(c == '(')

        {

            stk.push(-1);

            return true;

        }

        else if(c == ')')

        {

            double sum = 0;

            while(!stk.empty() && stk.top() > 0)

            {

                sum += stk.top();

                stk.pop();

            }

            if(stk.empty())

                state = state_t::error;

            else

            {

                stk.pop();

                stk.push(sum);

                return true;

            }

        }

        else

            state = state_t::error;



        return false;

    };



    auto pnum = [&](char c) {

        if(std::isdigit(c))

        {

            num = 10 * num + c - '0';

            return true;

        }

        else

        {

            stk.top() *= num;

            state = state_t::start;

        }

        return false;

    };



    auto pchem = [&](char c){

        if(std::islower(c))

        {

            chem += c;

            return true;

        }

        else

        {

            if(auto it = m.find(chem); it != m.end())

            {

                stk.push(it->second);

                state = state_t::start;

            }

            else

                state = state_t::error;

        }

        return false;

    };



    while(b != e)

    {

        switch(state)

        {

        case state_t::start:

            if(pstart(*b))

                b++;

            break;

        case state_t::num:

            if(pnum(*b))

                b++;

            break;

        case state_t::chem:

            if(pchem(*b))

                b++;

            break;

        default:

            return -1;

        }

    }



    switch(state)

    {

    case state_t::num:

        pnum('n');

        break;

    case state_t::chem:

        pchem('n');

        break;

    }



    if(state == state_t::error)

        return -1;



    double sum = 0;

    while(!stk.empty() && stk.top() > 0)

    {

        sum += stk.top();

        stk.pop();

    }

    if(stk.size() > 0)  // expected ')'

        return -1;



    return sum;

}



int main()

{

    std::unordered_map<std::string, double> m;

    m["Na"] = 23.5;

    m["Cl"] = 35.5;

    m["O"] = 16;

    m["N"] = 14;

    m["Fe"] = 55.8;



    std::string line;

    while(getline(std::cin, line))

        std::cout << parse(std::move(line), m) << 'n';

}

c++ parsing

edited Jan 27 at 14:38

asked Jan 27 at 10:26

友人A

436

edited Jan 27 at 14:38

asked Jan 27 at 10:26

友人A

436

edited Jan 27 at 14:38

asked Jan 27 at 10:26

友人A

436

asked Jan 27 at 10:26

友人A

436

asked Jan 27 at 10:26

友人A

436

1

$begingroup$
I realized if(stk.size() > 1) should be if(stk.size() > 0), should I edit?
$endgroup$
– 友人A
Jan 27 at 11:08

2

$begingroup$
Feel free to edit as long as there are no answers. But try to not edit as time goes by.
$endgroup$
– Incomputable
Jan 27 at 11:26

2

$begingroup$
Note, the singular is “automaton”.
$endgroup$
– Konrad Rudolph
Jan 27 at 14:26

1

$begingroup$
There is a standard for writing chemical formulas to be machine parsable so that they are not ambiguous. Its called SMILE Simple Molecule Input Line Entry. The trouble with your system is that for more complex molecules you can't be unambiguous.
$endgroup$
– Martin York
Jan 27 at 18:50

1

$begingroup$
You said that you wrote a parser but the result of the function is double so clearly you’re doing way more than just parsing, and it’s not at all clear (from the names, nor the usage) what transformation you’re performing. I’m guessing you are calculating the molecular weight? In which case the function parse should be renamed appropriately.
$endgroup$
– Konrad Rudolph
Jan 28 at 11:32

|
show 2 more comments

1

$begingroup$
I realized if(stk.size() > 1) should be if(stk.size() > 0), should I edit?
$endgroup$
– 友人A
Jan 27 at 11:08

2

$begingroup$
Feel free to edit as long as there are no answers. But try to not edit as time goes by.
$endgroup$
– Incomputable
Jan 27 at 11:26

2

$begingroup$
Note, the singular is “automaton”.
$endgroup$
– Konrad Rudolph
Jan 27 at 14:26

1

$begingroup$
There is a standard for writing chemical formulas to be machine parsable so that they are not ambiguous. Its called SMILE Simple Molecule Input Line Entry. The trouble with your system is that for more complex molecules you can't be unambiguous.
$endgroup$
– Martin York
Jan 27 at 18:50

1

$begingroup$
You said that you wrote a parser but the result of the function is double so clearly you’re doing way more than just parsing, and it’s not at all clear (from the names, nor the usage) what transformation you’re performing. I’m guessing you are calculating the molecular weight? In which case the function parse should be renamed appropriately.
$endgroup$
– Konrad Rudolph
Jan 28 at 11:32

I realized if(stk.size() > 1) should be if(stk.size() > 0), should I edit?

– 友人A
Jan 27 at 11:08

Feel free to edit as long as there are no answers. But try to not edit as time goes by.

– Incomputable
Jan 27 at 11:26

Note, the singular is “automaton”.

– Konrad Rudolph
Jan 27 at 14:26

There is a standard for writing chemical formulas to be machine parsable so that they are not ambiguous. Its called SMILE Simple Molecule Input Line Entry. The trouble with your system is that for more complex molecules you can't be unambiguous.

– Martin York
Jan 27 at 18:50

You said that you wrote a parser but the result of the function is double so clearly you’re doing way more than just parsing, and it’s not at all clear (from the names, nor the usage) what transformation you’re performing. I’m guessing you are calculating the molecular weight? In which case the function parse should be renamed appropriately.

– Konrad Rudolph
Jan 28 at 11:32

|
show 2 more comments

2 Answers
2

active

oldest

votes

Using a state machine is fine, but is usually harder to get right than writing the parser with focus on the grammar. You're also mixing a computation with the parsing which adds more information to consider when analyzing the code.
I would recommend to separate the parser code from the computation and then writing the parser strictly following the grammar you want to parse. I'll try to illustrate what I mean by giving a simplified version of your parser. Say you have a grammar:

formula = *(group)

group = element [ count ]

element = uppercase [ lowercase ]

count = "0" .. "9"

You can now give a function for each non-terminal:



// formula = *(group)   ; * becomes while (...) ...

std::list<group> parse_formula(std::stringstream& s)

{

    std::list<group> rv;



    while (!s.eof())

    {

        rv.push_back(parse_group(s));

    }



    return rv;

}



// group = element [ count ]

group parse_group(std::stringstream& s)

{

    group g;

    group.element = parse_element(s);

    try

    {

        group.count = parse_count(s);

    }

    catch (const parse_failed&)

    {

        group = 1;

    }

}



// element = uppercase [lowercase]

std::string parse_element(std::stringstream& s)

{

    if (!std::isupper(s.peek()))

    {

        throw parse_failed(...);

    }



    std::string element = s.get();;



    if (std::islower(s.peek()))

    {

        s.get(ch);

        element += ch;

    }



    return element;

}



// count = [0-9]

unsigned parse_count(std::stringstream& s)

{ 

    if (!std::isdigit(s.peek()))

    {

        throw parse_failed(...);

    }



    unsigned rv; 

    s >> rv; // this actually violates the grammar as it reads multiple digits

    return rv;

}

You can then iterate over the list of groups and compute the sum.

edited Jan 27 at 14:38

answered Jan 27 at 12:13

Cornholio

48617

add a comment |

Lambdas Too Complex

You say that the reason behind your lambdas is that they carry a lot of state around, but it is mutating plenty of it as well. Extracting your lambdas into methods in their own right would make it easier to keep track of where your state is used and where it is mutated.

Methods Should Calculate Or Mutate, Not Both

Lambdas that calculate something and then mutate something else are notoriously difficult to debug and maintain. When you create your methods, it should either calculate something (such as the atomic weight) or change something (such as changing your state).

Variable Names Unclear

Names like pstart, num and m do not tell me anything about the purpose of these variables. There is nothing wrong with LongVariableNamesThatExplainWhatTheyAreDoing.

answered Jan 27 at 12:15

Joe C

1,045211

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f212312%2fchemical-formula-parser%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

formula = *(group)

group = element [ count ]

element = uppercase [ lowercase ]

count = "0" .. "9"

You can now give a function for each non-terminal:



// formula = *(group)   ; * becomes while (...) ...

std::list<group> parse_formula(std::stringstream& s)

{

    std::list<group> rv;



    while (!s.eof())

    {

        rv.push_back(parse_group(s));

    }



    return rv;

}



// group = element [ count ]

group parse_group(std::stringstream& s)

{

    group g;

    group.element = parse_element(s);

    try

    {

        group.count = parse_count(s);

    }

    catch (const parse_failed&)

    {

        group = 1;

    }

}



// element = uppercase [lowercase]

std::string parse_element(std::stringstream& s)

{

    if (!std::isupper(s.peek()))

    {

        throw parse_failed(...);

    }



    std::string element = s.get();;



    if (std::islower(s.peek()))

    {

        s.get(ch);

        element += ch;

    }



    return element;

}



// count = [0-9]

unsigned parse_count(std::stringstream& s)

{ 

    if (!std::isdigit(s.peek()))

    {

        throw parse_failed(...);

    }



    unsigned rv; 

    s >> rv; // this actually violates the grammar as it reads multiple digits

    return rv;

}

You can then iterate over the list of groups and compute the sum.

edited Jan 27 at 14:38

answered Jan 27 at 12:13

Cornholio

48617

add a comment |

formula = *(group)

group = element [ count ]

element = uppercase [ lowercase ]

count = "0" .. "9"

You can now give a function for each non-terminal:



// formula = *(group)   ; * becomes while (...) ...

std::list<group> parse_formula(std::stringstream& s)

{

    std::list<group> rv;



    while (!s.eof())

    {

        rv.push_back(parse_group(s));

    }



    return rv;

}



// group = element [ count ]

group parse_group(std::stringstream& s)

{

    group g;

    group.element = parse_element(s);

    try

    {

        group.count = parse_count(s);

    }

    catch (const parse_failed&)

    {

        group = 1;

    }

}



// element = uppercase [lowercase]

std::string parse_element(std::stringstream& s)

{

    if (!std::isupper(s.peek()))

    {

        throw parse_failed(...);

    }



    std::string element = s.get();;



    if (std::islower(s.peek()))

    {

        s.get(ch);

        element += ch;

    }



    return element;

}



// count = [0-9]

unsigned parse_count(std::stringstream& s)

{ 

    if (!std::isdigit(s.peek()))

    {

        throw parse_failed(...);

    }



    unsigned rv; 

    s >> rv; // this actually violates the grammar as it reads multiple digits

    return rv;

}

You can then iterate over the list of groups and compute the sum.

edited Jan 27 at 14:38

answered Jan 27 at 12:13

Cornholio

48617

add a comment |

formula = *(group)

group = element [ count ]

element = uppercase [ lowercase ]

count = "0" .. "9"

You can now give a function for each non-terminal:



// formula = *(group)   ; * becomes while (...) ...

std::list<group> parse_formula(std::stringstream& s)

{

    std::list<group> rv;



    while (!s.eof())

    {

        rv.push_back(parse_group(s));

    }



    return rv;

}



// group = element [ count ]

group parse_group(std::stringstream& s)

{

    group g;

    group.element = parse_element(s);

    try

    {

        group.count = parse_count(s);

    }

    catch (const parse_failed&)

    {

        group = 1;

    }

}



// element = uppercase [lowercase]

std::string parse_element(std::stringstream& s)

{

    if (!std::isupper(s.peek()))

    {

        throw parse_failed(...);

    }



    std::string element = s.get();;



    if (std::islower(s.peek()))

    {

        s.get(ch);

        element += ch;

    }



    return element;

}



// count = [0-9]

unsigned parse_count(std::stringstream& s)

{ 

    if (!std::isdigit(s.peek()))

    {

        throw parse_failed(...);

    }



    unsigned rv; 

    s >> rv; // this actually violates the grammar as it reads multiple digits

    return rv;

}

You can then iterate over the list of groups and compute the sum.

edited Jan 27 at 14:38

answered Jan 27 at 12:13

Cornholio

48617

formula = *(group)

group = element [ count ]

element = uppercase [ lowercase ]

count = "0" .. "9"

You can now give a function for each non-terminal:



// formula = *(group)   ; * becomes while (...) ...

std::list<group> parse_formula(std::stringstream& s)

{

    std::list<group> rv;



    while (!s.eof())

    {

        rv.push_back(parse_group(s));

    }



    return rv;

}



// group = element [ count ]

group parse_group(std::stringstream& s)

{

    group g;

    group.element = parse_element(s);

    try

    {

        group.count = parse_count(s);

    }

    catch (const parse_failed&)

    {

        group = 1;

    }

}



// element = uppercase [lowercase]

std::string parse_element(std::stringstream& s)

{

    if (!std::isupper(s.peek()))

    {

        throw parse_failed(...);

    }



    std::string element = s.get();;



    if (std::islower(s.peek()))

    {

        s.get(ch);

        element += ch;

    }



    return element;

}



// count = [0-9]

unsigned parse_count(std::stringstream& s)

{ 

    if (!std::isdigit(s.peek()))

    {

        throw parse_failed(...);

    }



    unsigned rv; 

    s >> rv; // this actually violates the grammar as it reads multiple digits

    return rv;

}

You can then iterate over the list of groups and compute the sum.

edited Jan 27 at 14:38

answered Jan 27 at 12:13

Cornholio

48617

edited Jan 27 at 14:38

answered Jan 27 at 12:13

Cornholio

48617

answered Jan 27 at 12:13

Cornholio

48617

answered Jan 27 at 12:13

Cornholio

48617

add a comment |

Lambdas Too Complex

Methods Should Calculate Or Mutate, Not Both

Variable Names Unclear

Names like pstart, num and m do not tell me anything about the purpose of these variables. There is nothing wrong with LongVariableNamesThatExplainWhatTheyAreDoing.

answered Jan 27 at 12:15

Joe C

1,045211

add a comment |

Lambdas Too Complex

Methods Should Calculate Or Mutate, Not Both

Variable Names Unclear

Names like pstart, num and m do not tell me anything about the purpose of these variables. There is nothing wrong with LongVariableNamesThatExplainWhatTheyAreDoing.

answered Jan 27 at 12:15

Joe C

1,045211

add a comment |

Lambdas Too Complex

Methods Should Calculate Or Mutate, Not Both

Variable Names Unclear

Names like pstart, num and m do not tell me anything about the purpose of these variables. There is nothing wrong with LongVariableNamesThatExplainWhatTheyAreDoing.

answered Jan 27 at 12:15

Joe C

1,045211

Lambdas Too Complex

Methods Should Calculate Or Mutate, Not Both

Variable Names Unclear

Names like pstart, num and m do not tell me anything about the purpose of these variables. There is nothing wrong with LongVariableNamesThatExplainWhatTheyAreDoing.

answered Jan 27 at 12:15

Joe C

1,045211

answered Jan 27 at 12:15

Joe C

1,045211

answered Jan 27 at 12:15

Joe C

1,045211

answered Jan 27 at 12:15

Joe C

1,045211

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Code Review Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Jtdylktuy