The 2024 has begun. To stay true to my hopes and wishes, I have to use this website as a place where I am going to document my project endeavors. I don’t consider myself an entrepreneur.. yet, but if I earn $ from anything that I do make in the future – I will consider myself worthy of that title.
We began December 2023 with a small project – an AI that generates real estate listing descriptions. The idea was very simple – feed a model with a couple of images, and get a description based off of those. Why I thought it was novel was because, at the time, none of the websites already offering this had the option to upload apartment pictures – all that you could do was describe what you were selling, and select a couple of hashtags related to your real estate.
I wanted to make it different. I wanted to allow photo uploads, and I wanted the model to focus on the materials shown in the pictures, on the natural lighting, and the surroundings.
It took me 13 days to make an MVP. Bear in mind, I have no coding background. In 2019 I did a Udemy course covering some of the Python basics because at the time, I was interested in Machine Learning / Deep Learning. I tried coding as a hobby for about a year, but since I could land a junior job after the year had passed, I dropped that hobby completely.
Four years later, I was back at it. It felt as if I never touched programming – I had completely forgotten most of the stuff I learned. Luckily, this time it was not knowledge I was armed with; no, this time, I had ChatGPT on the side. I decided I would pay a monthly fee for GPT 4 access and start bombarding it with questions daily. If nothing, I would become better at prompting.
Honestly, I thought it would take me at least a month before I could launch an MVP. I decided to focus on the backend, asking for chunks of code from GPT daily and slowly gluing it all together piece by piece. I had no idea how web frameworks worked, but thankfully, I was able to get familiar with that through ChatGPT as well. I want to slowly work you through the whole building process.
First, I needed to draw on a piece of paper the logic behind the website. It all sounded so simple:
- upload a batch of photos to a hosting website, where I would later access the URLs from the uploaded photos
- send URLs of uploaded photos to an AI model that specializes in images
- generate a description of the photos (separated or merged together), save it
- send a description variable back to the AI model, this time specialized in text generation, and ask of it to generate a sales description
- display a result to the user
For image upload, I went with Google Cloud buckets. The dashboard can look daunting to a newbie (which I was), but it was straightforward enough. I had to play around with settings in order to switch from private to public items in the bucket, other than that, everything else was fine.
I chose Microsoft Azure Computer Vision as the first AI model, the one that would read the images, as per ChatGPTs recommendation (the most popular alternatives are Google Cloud Vision AI and Amazon Rekognition). It took me a day and a half just to set up an account and get used to Azure’s dashboard. After that, I started sending the pictures to Azure’s vision AI to produce results.
I know there were a few parameters that I did not tweak in Azure, but in general, I just did not like the results that I was getting, and the descriptions the model was providing. I considered switching to Google’s AI when I realized – why not just use OpenAI’s vision model?
Why not use OpenAI for both image reading and description generation? It seemed like a logical step to take. It would reduce the complexity. And so I did just that.
After a few days of tweaking the prompts for the vision model of OpenAI, and for the real estate description generator model, I had a working project. I added sessions because I wanted to limit how many pictures a single user could upload (otherwise it would be very costly for me), and also, I wanted to limit the description generations per day, per user. The main issue was – this thing was running locally. I had NO IDEA how I could take all of the code and upload it online, make it available as a website.
After more consulting with ChatGPT and Google, I had to make changes in the code (some protected files were stored locally, that needed to be moved to the cloud), and I decided to host it on Heroku. This last part turned out to be harder than expected because I kept getting errors in my code. I had to commit and recommit to GitHub more times than I want to admit.
After 3 or 4 days of hell, I finally got the thing running.
It needed a lot of tweaking, but the main thing is – it worked. Now, it’s been a few days since I got it to work, and I am kind of in a “writer’s block” – I wanted to make the front end pretty, or rather dynamic, but there’s no way for me to do it alone. I tried with GPT, with Google, Youtube tutorials, and I still can’t figure it out. If it was me, I would want my landing page and the generation page to look something in the style of this website:
And hiring someone to do it might cost me a lot. I am not sure if I should ditch the thing completely, or if I should just stick with it, invest in an engineer or two to help me make it seamless, and then later see how I can get users for it.
That’s the main issue – I do not know if there is a market for this thing. I wouldn’t want to put $1000 to $3000 of my own money into it, if I did not get any kind of social proof that someone would use it. And another issue is, I don’t know how I would scale or advertise it. I mean, programming was not my field either, and I got it to work, so I am pretty sure I would figure it out.
As of now, I am not sure what to do. If I decide to get things into motion again, I will write a new post, or update this one. Farewell for now.