Pitch to Perspective
Link to the GitHub repo
Entry #1: The Origin of an Idea
The concept for this project first struck me during the summer of 2023, just before my senior year as a physics major. While browsing job postings, I came across an internship opportunity at a venture capital firm. The position focused on developing an app capable of processing a PDF pitch deck using AI, automatically generating a summary, and producing an investment memo.
I was captivated by the potential of this technology. The idea of automating parts of the decision-making process for venture capitalists felt both innovative and practical. However, despite my curiosity, I didn’t apply. The job description listed several technical prerequisites I didn’t yet possess, and with the demands of my physics coursework and leadership responsibilities, I couldn’t afford to pursue it further.
As the academic year progressed, the idea faded into the background. Senior year was a balancing act of advanced classes, research projects, and campus commitments. After graduation, I turned my attention to the next phase of my life: preparing for the GRE, exploring graduate programs, and submitting applications. The project idea became a distant memory, overshadowed by these more immediate concerns.
Everything changed at the start of the new year. With my applications submitted, I suddenly found myself with free time while waiting for responses. Seeking a creative outlet, I began experimenting with the latest AI-driven IDEs. The experience was eye-opening. Although I had some programming experience from my work in high-energy physics (mainly with C++), these new tools operated on a different level entirely. They allowed me to quickly prototype projects simply by providing prompts. I created everything from browser extensions to game modifications, and each success fueled my enthusiasm.
Amid this creative exploration, the old idea from that job listing resurfaced. It lingered in my mind, prompting me to wonder whether anyone had developed something similar. A quick Google search yielded no significant results. That’s when it hit me—there was a gap to fill, and I had the chance to fill it.
With a renewed sense of purpose, I committed to building a prototype for the app. This journal will serve as a record of my journey—the breakthroughs, setbacks, and lessons—as I work to transform that initial spark into a fully functioning product.
Entry #2: December 29, 2024
I decided to begin the front-end portion of the project using V0 by Vercel. After laying down the initial structure, I moved the codebase into Cursor to further experiment with different layouts and component designs. Although I hadn’t yet mapped out a concrete plan, I dove into refining the user interface. Most of the code was written in TypeScript, and I quickly realized I would need to address potential problems—such as state management and data flow—in the near future. For the moment, though, I was more focused on getting the visual elements in place.
By the end of the day, I had built three core pages. The first was a landing page where users could upload their pitch decks. The second was a results page designed to display AI-generated memos. Lastly, I created a page to show previously generated memos. None of the features were actually functional yet; it was purely front-end work without any back-end integration.
Still, it felt good to see a tangible outline of how the app might look. Before wrapping up, I created a GitHub repository and uploaded the code. Although I have plenty of unresolved technical questions, I’m looking forward to tackling them once I’m satisfied with the UI.
Entry #3: January 1, 2025
I didn’t accomplish much today in terms of functionality; I mainly experimented with visual elements to refine the overall look and feel of the web app. In a burst of ambition, I briefly toyed with the idea of implementing an analytics page where venture capitalists could track their investments and performance metrics. My vision is to create a one-stop solution for VC firms, but I quickly realized this feature isn’t an immediate priority.
Despite that, I pushed the work-in-progress to my GitHub repository. Even if the analytics component won’t make it into the app anytime soon, it was helpful to flesh out the concept. I plan to revisit it later once I have the core features in place. For now, I need to focus on the app’s fundamental functionality and ensure that everything runs smoothly before expanding into analytics or other supplementary features.
Entry #4: January 25, 2025
It’s been a while since I last touched the codebase, largely because I was juggling deadlines for another round of grad school applications. Today, I managed to implement the upload feature, making it possible for users to actually submit their files to the front end. During this process, I also learned a great deal about hosting on Vercel, which should streamline deployment and testing in the future.
I’m glad to have made some progress after a bit of a hiatus, and I’ve pushed these updates to GitHub to keep everything organized and backed up. I’m looking forward to picking up the pace again, now that my schedule has cleared up a bit.
Entry #5: February 1, 2025
I made some significant changes to the app today by transitioning its core functionality to Python, a language I’m already somewhat familiar with. Although I had to abandon the initial front-end work, this shift should make it easier to integrate the features I have in mind.
On the bright side, I managed to implement several key components this time around. First, I improved the upload feature, allowing users to submit PDFs that can be processed by the AI model. Second, I added the capability to parse text from those PDFs, preparing them for analysis. Finally, I introduced the function that passes the extracted data to the AI model. Currently, I’m using Facebook’s BART model for summarization.
While it’s unfortunate that the previous front-end work had to be left behind, I feel this new approach will give me the flexibility and efficiency I need to move forward. I’ve updated the GitHub repository to reflect these changes. From here, I’m excited to refine the model interactions and eventually revisit the idea of a polished interface once the core logic is fully established.
I have been using the new o3-mini-high to help me with coding, and it has been amazing.
Entry #6: February 4, 2025
As I began testing the app with real-world pitch deck PDFs, I noticed a number of errors and inconsistencies. My first suspicion was that the PDF parsing wasn’t producing readable text, so I added a feature that allowed me to view the parsed output directly. Sure enough, there were unwanted spaces and page breaks scattered throughout.
To address this, I wrote a function to clean and format the parsed text before passing it to the AI. This improved the quality of the generated memos somewhat, but I still wasn’t fully satisfied. Eager to see better results, I experimented with different models on Hugging Face. While I haven’t updated the code to leverage a new API yet, I decided to switch to Google’s FLAN-T5-Base model moving forward.
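The cleanup function followed roughly this shape. This is a simplified sketch, not the exact code in the repo; the specific regexes (hyphen rejoining, page-break handling) are illustrative:

```python
import re

def clean_parsed_text(raw: str) -> str:
    """Normalize text extracted from a PDF before sending it to the model."""
    text = raw.replace("\f", "\n")                # form feeds mark page breaks
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # rejoin words hyphenated across lines
    text = re.sub(r"[ \t]+", " ", text)           # collapse runs of spaces and tabs
    text = re.sub(r"\n{3,}", "\n\n", text)        # squeeze long blank-line runs
    return "\n".join(line.strip() for line in text.splitlines()).strip()
```

Each pass is a cheap regex, so the whole cleanup adds negligible latency compared to the model call itself.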
This was another big step, and although there’s still plenty of fine-tuning to do, the parsing and summarization process feels more robust than ever. Once I integrate the new model, I’m hopeful that the memos will reach a higher level of readability and usefulness. I’ve pushed these updates to GitHub.
Entry #7: February 6, 2025
Today was all about refining the text extraction process and handling the wide variety of pitch deck formats that real-world PDFs can throw at us. After noticing that some decks were primarily images or included scanned pages, I implemented a fallback parsing approach: the app now tries PyPDF2 first, then pdfplumber, and finally OCR if both methods fail to return clean text. This ended up being a significant upgrade to the robustness of the system.
To support OCR, I had to install Tesseract on my machine (and Poppler for Windows users) and configure the app to rely on those tools. That part was a bit tricky—especially making sure the binaries were in the system’s PATH—but it was worth it. Now the app can handle even the most stubborn image-based pitch decks.
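In the app the chain is PyPDF2, then pdfplumber, then Tesseract OCR. Stripped of the library-specific calls, the fallback pattern itself can be sketched like this; the extractor callables and the `min_chars` threshold are illustrative, not the repo's actual values:

```python
from typing import Callable, Iterable, Optional

def extract_with_fallback(path: str,
                          extractors: Iterable[Callable[[str], str]],
                          min_chars: int = 50) -> Optional[str]:
    """Try each extractor in order; accept the first result that looks like real text."""
    for extract in extractors:
        try:
            text = extract(path)
        except Exception:
            continue  # a parser crashing just means we fall through to the next one
        if text and len(text.strip()) >= min_chars:
            return text
    return None  # every extractor failed or returned too little text
```

The `min_chars` gate is what makes the chain useful: an image-only PDF makes PyPDF2 "succeed" with near-empty output, so a length check is needed to trigger the OCR fallback.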
The other big focus today was cleaning up the parsed text. I added a new function for advanced spacing fixes, which helps join words and letters that the parser sometimes outputs with extra spaces. Then, I integrated more targeted noise removal to filter out headers, footers, and repeated lines. Finally, I included better handling of punctuation and optional sentence segmentation to make the text more readable. Although there are still occasional oddities—like missing spaces around punctuation—the output is leaps and bounds better than before.
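The header/footer removal relies on a simple observation: real content varies from page to page, while boilerplate repeats. A minimal sketch of that idea (the repetition threshold is an assumption, not the app's exact value):

```python
from collections import Counter
from typing import List

def strip_repeated_lines(pages: List[str], threshold: float = 0.6) -> str:
    """Remove lines that recur on most pages -- typically headers and footers."""
    # Count each distinct (stripped) line once per page it appears on.
    counts = Counter()
    for page in pages:
        counts.update({line.strip() for line in page.splitlines() if line.strip()})
    cutoff = max(2, int(len(pages) * threshold))
    kept_pages = []
    for page in pages:
        kept = [ln for ln in page.splitlines()
                if ln.strip() and counts[ln.strip()] < cutoff]
        kept_pages.append("\n".join(kept))
    return "\n\n".join(kept_pages)
```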
Seeing how far the text extraction and formatting have come is really satisfying. Next up, I plan to revisit the AI model itself. My hope is that, with cleaner text as input, switching to a more advanced model (like a future GPT-4 or GPT-4o variant) will yield professional-grade memos. I’ll push all these updates to GitHub so that others can see how the fallback parsing and improved cleaning pipeline were implemented.
Entry #8: February 7, 2025
More experimentation with AI models today. I shifted gears by changing our API integration from FLAN-T5 to DeepSeek. I updated the code to send requests to DeepSeek via OpenRouter, adapting the payload to match the chat-based format the model expects. Alongside this switch, I dove into understanding the interplay between context limits and output token limits. Although the overall context for our providers is massive (up to 164K tokens), I discovered that some providers (like Azure) impose a much stricter output cap (around 4K tokens). This realization led me to design our prompts with worst-case limits in mind, ensuring that the generated memo isn't unexpectedly truncated. Today's progress has sharpened my focus on balancing prompt size and output quality, a critical step toward building a robust pitch deck analyzer.
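The worst-case budgeting boils down to reserving the strictest provider's output cap out of the context window before sizing the prompt. A rough sketch, assuming a crude four-characters-per-token estimate (a real tokenizer should replace it in production):

```python
def truncate_for_model(text: str, context_limit: int = 164_000,
                       output_cap: int = 4_096, chars_per_token: int = 4) -> str:
    """Trim input so prompt plus worst-case output fit the tightest provider window.

    The 4-chars-per-token ratio is a heuristic; swap in a real tokenizer
    for anything beyond a rough safety margin.
    """
    budget_tokens = context_limit - output_cap   # tokens left for the prompt
    budget_chars = budget_tokens * chars_per_token
    return text if len(text) <= budget_chars else text[:budget_chars]
```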
Entry #9: February 8, 2025
Today I made significant strides in improving the reliability and user experience of our pitch deck screening application. I tackled one of the biggest challenges with large AI models: hallucinations, where the model confidently generates incorrect information. Recognizing this risk, I introduced a data validation feature to ensure the generated memos remain accurate.
Initially, I experimented with automatic validation, appending a validation summary to the memo. However, I found this approach too rigid: it often missed the context analysts truly cared about and cluttered the memos with irrelevant references. Instead, I opted for a user-in-the-loop approach. Now, analysts can highlight specific text in the memo and click a "Validate Selection" button, which triggers a query to Google's Custom Search. I have also configured the custom search engine to focus on trusted sources (like CB Insights, Statista, PitchBook, etc.), reducing the chance of irrelevant or inaccurate data creeping into our validation results.
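On the backend, the highlighted selection just becomes a query against the Custom Search JSON API. A minimal sketch of building that request URL (the helper name and parameters beyond the documented `key`/`cx`/`q`/`num` are my own):

```python
from urllib.parse import urlencode

CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def build_validation_query(selection: str, api_key: str, engine_id: str,
                           num_results: int = 5) -> str:
    """Build a Custom Search JSON API request URL for a highlighted claim."""
    params = {"key": api_key, "cx": engine_id, "q": selection, "num": num_results}
    return f"{CSE_ENDPOINT}?{urlencode(params)}"
```

The trusted-source restriction lives in the search engine's own configuration (`cx`), so the app never has to maintain a domain allowlist in code.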
I also devoted time to UI/UX refinements. I extracted inline styles and scripts into their own files, making the codebase cleaner and more maintainable. The search results now appear in user-friendly cards, styled with Tailwind CSS, giving the validation feedback a more polished look.
These updates, combined with yesterday’s shift to DeepSeek as our AI model, position us well for delivering a more robust pitch deck analyzer. By blending strong AI capabilities with carefully designed user interactions, we are striking a balance between automation and human oversight—critical in a domain where both efficiency and accuracy matter deeply.
Entry #10: February 9, 2025
Huge updates today—finally, a major structural milestone has been reached: I’ve successfully separated the frontend from the backend. This decoupling has been a long time coming and marks a significant step toward a more modular and maintainable architecture.
I decided to build out the new frontend in TypeScript, and while I was excited about the fresh possibilities this approach offers, the implementation turned out to be more challenging than anticipated. The task required me to transform our once monolithic Flask app into a dedicated API backend. This meant stripping away the parts tightly coupled with the UI and ensuring that the backend could serve data seamlessly to the new front-end interface.
One of the most frustrating—and completely unexpected—issues I ran into was a clash between Apple’s AirDrop and Handoff services. Both were vying for the same local ports that our Flask API depended on. It took hours of head-scratching and iterative testing to pinpoint and resolve this port conflict. Once I finally rerouted the conflicting services and locked down the necessary configurations, the API began to function as intended.
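For anyone chasing a similar conflict, a quick stdlib check tells you whether something is already listening on a port before Flask even starts (this is a general diagnostic, not code from the repo):

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on the given host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        return sock.connect_ex((host, port)) == 0  # 0 means the connect succeeded
```

Running this against Flask's default port before launch would have saved me most of those hours of head-scratching.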
On the frontend side, the transformation is striking. The redesigned landing page, crafted in TypeScript with modern UI principles, now looks polished and professional. The visual upgrade isn’t just skin-deep; it sets a solid foundation for a more dynamic and interactive user experience. With these updates, the application not only runs smoother but also feels more in line with contemporary web app standards.
I pushed all these changes to GitHub, and though there’s still plenty to refine, today’s progress has boosted my confidence. It’s encouraging to see the project evolve, and I’m eager to continue enhancing both the functionality and the aesthetics of the app in the coming days.
Entry #11: February 10, 2025
More landing page UI changes. I might have a name for the app: KECHA
Entry #12: February 11, 2025
Changed the prompt for the Llama model that cleans the extracted text. I'm getting busier again with a few grad school interviews coming up; I had Notre Dame today, and it went well (I think).
Entry #13: February 13, 2025
The original Llama model we were using wasn't consistent enough, so I swapped the model that processes the raw text for nvidia/llama-3.1-nemotron-70b-instruct and added code to separate the chain-of-thought (CoT) reasoning from the generated memo. I also interviewed at Fordham and received an offer for the MS in Finance from the University of Rochester.
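The CoT separation amounts to cutting the reasoning span out of the response before displaying the memo. A sketch, assuming the provider wraps its reasoning in `<think>...</think>` tags as DeepSeek-style reasoning models do; the delimiters would need adjusting for other providers:

```python
import re

def split_cot(response: str, open_tag: str = "<think>", close_tag: str = "</think>"):
    """Split a model response into (chain_of_thought, memo).

    The <think> delimiters are an assumption about the provider's output
    format, not something guaranteed by every model.
    """
    match = re.search(re.escape(open_tag) + r"(.*?)" + re.escape(close_tag),
                      response, flags=re.DOTALL)
    if not match:
        return "", response.strip()  # no reasoning block found; return memo as-is
    cot = match.group(1).strip()
    memo = (response[:match.start()] + response[match.end():]).strip()
    return cot, memo
```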
Entry #14: February 14, 2025
Refactored the CORS settings and updated the text processing logic for improved formatting. Happy Valentine's Day <3
Entry #15: February 15, 2025
Added a loading page for a better user experience and refactored text handling in the edit and results components. I'm also reviewing recent academic papers that use ML or data analysis to predict startup success, to narrow down the variables I should weigh when giving a deck a final score out of 5. Nothing crazy yet, but Twitter and LinkedIn follower counts are surprisingly strong indicators in these studies. My one concern is that these studies probably look at already established startups with decent funding; the startups that failed despite good socials wouldn't maintain an online presence, so I suspect some survivorship bias in the data. (This needs further study on my part; I'm not trying to invalidate an entire paper.)
Entry #16: February 16, 2025
Updated CORS settings for production and adjusted API URL handling in the frontend components; merged branch 'master' of https://github.com/akassh9/pitch-deck-analyzer; enhanced the CORS configuration and improved the loading-page timeout handling; removed an unused PDF example from the uploads directory.
Entry #17: February 24, 2025
Removed token limits for calls to Groq Cloud, updated .gitignore to include .env.local, enhanced the API key loading messages for better clarity, and handled exposed API keys along with further token limit adjustments.
Entry #18: February 27, 2025
Refactored the validation response handling and improved the text processing utilities.
Entry #19: March 1, 2025
Today was focused primarily on improving backend stability and security. I implemented global error handling and logging in app.py to better capture and understand failures, making debugging significantly easier. Additionally, I refactored the configuration management into a centralized Config class and integrated Redis, laying the groundwork for efficient job processing. Finally, I enhanced cross-origin resource sharing (CORS) support, adding configurations specifically tailored for the Vercel and Render frontend deployments. It feels satisfying to see the backend becoming increasingly robust and secure. All these improvements have been merged via pull requests and pushed to GitHub.
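The centralized Config class reads everything from the environment once, so the rest of the backend never touches `os.environ` directly. A sketch of the idea; the setting names here are illustrative, not necessarily the repo's actual keys:

```python
import os

class Config:
    """Single source of truth for settings, read once from the environment.

    The variable names below are examples; match them to your own .env keys.
    """
    def __init__(self, env=os.environ):
        self.redis_url = env.get("REDIS_URL", "redis://localhost:6379/0")
        self.allowed_origins = [
            o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()
        ]
        self.debug = env.get("FLASK_DEBUG", "0") == "1"
```

Passing `env` as a parameter also makes the class trivially testable with a plain dict instead of real environment variables.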
Entry #20: March 2, 2025
A productive session today, focusing on both backend logic and frontend interactions. I successfully integrated Redis for background PDF processing, significantly streamlining job management and handling larger batches of data. Additionally, I implemented the investment memo generation service and updated the API endpoint, allowing seamless communication between frontend and backend. To further enhance usability, I added the validation results component, ensuring analysts could easily verify generated memos. I also refined the UI by adding markdown stripping functionality and updated the textarea styling to use a serif font for readability. Finally, I updated the README to reflect these changes clearly.
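The markdown-stripping step is a handful of regex passes over the memo before it lands in the textarea. A simplified sketch (the exact patterns in the app differ, and a full Markdown parser would be more robust):

```python
import re

def strip_markdown(text: str) -> str:
    """Remove common Markdown syntax so memo text renders cleanly in a textarea."""
    text = re.sub(r"^#{1,6}\s*", "", text, flags=re.MULTILINE)   # headings
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)                 # bold (before italics)
    text = re.sub(r"\*(.+?)\*", r"\1", text)                     # italics
    text = re.sub(r"`(.+?)`", r"\1", text)                       # inline code
    text = re.sub(r"^\s*[-*]\s+", "", text, flags=re.MULTILINE)  # bullet markers
    return text
```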
Entry #21: March 5, 2025
Today was a significant step forward in improving user customization capabilities. I introduced support for customizable investment memo templates, allowing analysts to tailor memo outputs to their specific needs. I also spent considerable time refactoring frontend components and enhancing the API integration to ensure reliable and improved text processing and validation. These updates position the app closer to a genuinely customizable and user-focused experience.
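Customizable templates can be as simple as placeholder substitution over analyst-supplied text. A minimal sketch with the stdlib's `string.Template`; the placeholder names and default layout are illustrative, not the app's actual template format:

```python
from string import Template

# A default memo layout; analysts can supply their own text using the same placeholders.
DEFAULT_TEMPLATE = Template(
    "# Investment Memo: $company\n\n"
    "## Summary\n$summary\n\n"
    "## Market\n$market\n"
)

def render_memo(template: Template, sections: dict) -> str:
    """Fill a memo template, leaving unknown placeholders intact rather than failing."""
    return template.safe_substitute(sections)
```

`safe_substitute` is the key choice: if an analyst's custom template references a section the pipeline didn't generate, the memo still renders instead of raising a `KeyError`.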
Entry #22: March 6, 2025
Today involved significant refactoring to optimize the codebase. I cleaned up the PDF processing logic by removing unused functions, updating import paths, and enhancing text refinement logic to further improve parsing accuracy and readability. Additionally, I improved configuration management by consolidating various config imports into a new centralized configuration module, significantly streamlining future updates and maintenance. Finally, I enhanced text processing and job management with more robust result handling, ensuring smoother operations and clearer outputs. Pushed all updates to GitHub, maintaining momentum and clarity in the project's progress.