Spreadsheet Demo With Gemini (LLM)
Evaluation of Gemini (LLM) on a traditional coding interview question.
I recently tested Gemini, a large language model, on a simplified Python spreadsheet coding task, simulating a potential interview scenario. While the process revealed areas for improvement, Gemini's performance was impressive overall.
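For context, here is a minimal sketch of the kind of spreadsheet core the task called for: cells holding literals or simple formulas, recursive evaluation, and a value cache. The class and method names are my own illustration, not Gemini's output; see the snippet linked at the end for the real code.

```python
# A minimal sketch of a simplified spreadsheet core (illustrative only,
# not Gemini's actual output): literals, '=A1+B2'-style sum formulas,
# recursive evaluation, and a value cache.

class Spreadsheet:
    def __init__(self):
        self._formulas = {}  # cell name -> literal or formula string
        self._cache = {}     # cell name -> last computed value

    def set_cell(self, name, formula):
        """Store a literal (e.g. '3') or a formula (e.g. '=A1+B2')."""
        self._formulas[name] = formula
        self._cache.clear()  # coarse invalidation: a write may affect any dependent

    def get_cell(self, name):
        if name in self._cache:
            return self._cache[name]
        formula = self._formulas.get(name, "0")
        if formula.startswith("="):
            # Resolve each '+'-separated operand; no cycle detection here.
            value = sum(self._resolve(tok) for tok in formula[1:].split("+"))
        else:
            value = int(formula)
        self._cache[name] = value
        return value

    def _resolve(self, token):
        token = token.strip()
        return self.get_cell(token) if token[0].isalpha() else int(token)


# The kind of basic test the final code had in place.
sheet = Spreadsheet()
sheet.set_cell("A1", "3")
sheet.set_cell("B2", "4")
sheet.set_cell("C1", "=A1+B2")
assert sheet.get_cell("C1") == 7
```

Clearing the whole cache on every write is the simplest correct invalidation strategy; a fuller implementation would track per-cell dependencies and invalidate only what changed.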
- The final output was quite good. It had all the basics in place: correctness, tests, caching, and so on.
- It was good at following guidance. It made many mistakes, but it corrected itself iteratively, albeit one issue at a time. At times it was one step forward, one step back, fixing one problem while introducing a new error.
- The entire exercise took roughly 40 minutes, comparable to the time allotted in typical coding interviews.
- It could not develop the overall structure beyond the initial prompt. This may have been my mistake; perhaps I should have asked better, more open-ended questions. Further exploration is needed to understand how effectively LLMs can handle complex software design tasks.
All told, if this had been a real coding interview I would not have been impressed, but as a coding assistant it was quite impressive.
If you are interested, check out the final code with some minor finishing touches from me here: https://gitlab.com/-/snippets/3679180.