Instantiate the object of a Map only when necessary based on a regex

2

I have a Factory that instantiates the PARSER responsible for extracting information from a particular invoice format. To determine the invoice format I use regex. I loop every regex added to Map and check if the regex is found inside the text of that invoice, if found, then instantiate the PARSER responsible for extracting the data from that invoice.

For this, I created a Map of Class and String, where String is the regex. It was necessary to use Class because I can not instantiate this PARSER before knowing if it really is itself that it will be PARSER responsible because there are some logics in the constructor of each PARSER .

public class InvoiceParserFactory
{
    private static final Logger logger = LoggerFactory.getLogger(InvoiceParserFactory.class);

    private static Map<Class<? extends InvoiceParser>, String> map = new HashMap<>();

    static {
        map.put(xxx_PARSER.class, "regex_xxx");
        map.put(yyy_PARSER.class, "regex_yyy");
        map.put(zzz_PARSER.class, "regex_zzz");
        //...
    }

    public static InvoiceParser getParser(InvoicePdfReader reader)
    {
        String invoiceText = reader.getText();

        for (Map.Entry<Class<? extends InvoiceParser>, String> entry : map.entrySet()) {
            Class<?> clazz = entry.getKey();
            String regex = entry.getValue();

            Pattern p = Pattern.compile(regex);
            Matcher m = p.matcher(invoiceText);

            if (m.find()) {
                try {
                    Constructor<?> constructor = clazz.getConstructor(String.class);
                    return (InvoiceParser) constructor.newInstance(invoiceText);
                } catch (Exception e) {
                    logger.error("Can't instantiate parser.", e);
                }
            }
        }

        throw new InvoiceException("Can't find parser.");
    }
}

Doubt

Is this the best practice for this purpose? Particularly I found a gambiarra. Any suggestions to improve / refactor the code? Maybe using Java 8 features?

Problem

In the future, I will need to regex, to perform some other condition, so I would have to completely refactor this Factory to support that in addition to a regex, it is possible to use other logic to instantiate the correct PARSER or something similar. What is the best practice in this case?

    
asked by anonymous 08.05.2018 / 14:51

1 answer

2

I do not think gambiarra, although I have my personal opinion regarding the use of reflection in production (which should be moderate, which in your case I think it is). But this is just my opinion. Let's go to the code:

There is a clear limitation on the code, which is that all parsers are required to have a constructor that receives a String (since you are using clazz.getConstructor(String.class) ). If you take this care, there will be no problem.

If in the future you have other logic besides regex to build the parser, I suggest you create one or more classes that encapsulate this logic:

public class LogicaCriacaoParser {
    public boolean condicoesSatisfeitas() {
        // retorna "true" se todas as condições para criar o parser são satisfeitas
        // pode ser a regex, e o que mais você precisar
    }
}

In this case your map would have instances of LogicaCriacaoParser (or subclasses of it, if you prefer):

private static Map<Class<? extends InvoiceParser>, LogicaCriacaoParser> map = new HashMap<>();

static {
    map.put(xxx_PARSER.class, new LogicaCriacaoParser());
    map.put(yyy_PARSER.class, new Subclasse1LogicaCriacaoParser());
    map.put(zzz_PARSER.class, new Subclasse2LogicaCriacaoParser());
    //...
}

So in each subclass of LogicaCriacaoParser you could implement your specific logic to create each parser (within the condicoesSatisfeitas() method), it could be just regex, or regex + whatever criteria you need.

The good thing is that you can create whatever logic you want within this class (and its subclasses), and your model is not bound by a single logic such as regex.

To use it would be something like:

LogicaCriacaoParser logica = entry.getValue();

if (logica.condicoesSatisfeitas()) {
    // cria o parser
}

Alternative (without the map, nor reflection)

Actually you do not even need to have the map, and you can also eliminate the use of reflection.

The class LogicaCriacaoParser itself could have a method that returns the specific parser if the condition is satisfied. If you prefer, you can leave the abstract class and let each subclass take care of your particular case:

public abstract class LogicaCriacaoParser {
    // retorna "true" se todas as condições para criar o parser são satisfeitas
    // pode ser a regex, e o que mais você precisar
    protected abstract boolean condicoesSatisfeitas();

    // possui a lógica para criar o parser
    protected abstract InvoiceParser criarParser();

    // cria o parser se as condições forem satisfeitas
    public InvoiceParser getParser() {
        if (condicoesSatisfeitas()) {
             return criarParser();
        }
        // retorna null se as condições não forem satisfeitas
        return null;
    }
}

Note that only the getParser() method is public because it is the only method that other classes need to know. With this, you need to have subclasses that know how to create a specific parser:

// logica para criar XXX_Parser
public class LogicaCriacaoXXX_Parser extends LogicaCriacaoParser {
    protected boolean condicoesSatisfeitas() {
        // verifica as regras para criar o XXX_Parser (regex, etc)
    }

    protected InvoiceParser criarParser() {
        // retorna o XXX_Parser (usando new ao invés de reflection)
    }
}

So you create a subclass for each different parser (yyy, zzz, etc.), each implementing its specific logic.

So you would just have a list of all the parser creation logics:

List<LogicaCriacaoParser> logicas = new ArrayList<>();
logicas.add(new LogicaCriacaoXXX_Parser());
logicas.add(new LogicaCriacaoYYY_Parser());
// etc.

And to create a parser, you loop this list until you find some parser that satisfies the conditions:

for (LogicaCriacaoParser logica : logicas) {
    InvoiceParser parser = logica.getParser();
    if (parser != null) {
        return parser;
    }
}

// se não encontrar nenhum, lança o InvoiceException

Of course there are some improvements, such as changing the constructors of each logic class to receive parameters, the methods for checking the conditions themselves and creating the parser can also receive parameters (which could be used to construct the parser, for example) , etc.

But at least you're not using reflection (which by the way you did not feel comfortable using, and I understand you), and you do not get dependent on each parser subclass having a specific constructor that gets a String (already that each subclass of logic can create the parser using new ).

    
09.05.2018 / 14:25