"I'm looking to invest in your business, but first I'd like to have a look around your tech. Can you send over your source code and an AWS account login?"
Does this question strike fear into your heart?
When you're aiming for Series A (or any fundraising), it's only a matter of time before a potential corporate client, VC, or even prospective employee quizzes you on your approach to tech. It's not uncommon for them to request some source code or documentation – and you don't want to get caught out.
At Planes, we regularly work with clients at this stage and have built up a bank of questions that frequently come up during investor due diligence or penetration tests. As a general rule, it's good to refer to the OWASP Top 10 list of the most common security vulnerabilities found in modern web applications. But it doesn't cover everything.
This post should help guide your thinking when it comes to security and best practice for your team. Because, at the very least, keeping your tech in check will give you peace of mind as you grow.
Security and its governance are probably the most overlooked tech components in pre-Series A startups. Yet security will be the most scrutinised element during any due diligence, given the risk it poses to your potential investor or client.
Remember: your system is never fully "secure", and it’s more than likely the measures you have in place will fall short for at least one stakeholder. But by following best practice advice and coordinating some risk assessment and process governance, you'll have a culture to build upon.
I'd be surprised if any website operated without an SSL certificate in this day and age. Chrome has done a good job of highlighting to users when a page is not secure. But it's not only the data transferred between your user's browser and your server that you need to worry about. Be sure to consider all the other paths in your system where data is transferred between a client, server, or database, checking that you have sufficient encryption between where the data starts and where it ends up (aka Encryption in Transit).
As well as Encryption in Transit, you need to think about encryption when your data is stored (Encryption at Rest). Many popular database services, such as AWS RDS, offer this as a simple switch, and it shouldn't require any application changes to implement.
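To make Encryption in Transit concrete, here's a minimal sketch of the kind of check a web framework middleware performs to bounce plain-HTTP requests onto HTTPS. The function and parameter names are illustrative, not from any particular framework; in Express, for example, the inputs would come from `req.protocol`, `req.hostname`, and `req.originalUrl`.

```typescript
// Decide whether a request needs redirecting to HTTPS.
// Returns the secure URL to redirect to, or null if the
// request is already encrypted and can proceed as-is.
function httpsRedirectTarget(
  requestProto: string,
  host: string,
  path: string
): string | null {
  // Already secure – nothing to do.
  if (requestProto === "https") return null;
  // Otherwise, send the client to the HTTPS equivalent of the same URL.
  return `https://${host}${path}`;
}
```

A framework middleware would call this per request and issue a 301 redirect whenever it returns a URL.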
A piece commonly owned by backend engineers, access control is the simple act of putting blocks in place to prevent unintentional leaks of data to users who shouldn't be able to retrieve it. Let's say, for example, a social media app's RESTful API has an endpoint for retrieving a list of the signed-in user's connections. The endpoint should not return any protected information about the users, such as an email address or payment card information.
Though seemingly obvious, it's easy to slip up when systems grow in complexity, which is why you need to consider how to monitor and moderate access control in your codebase. As a starting point, automated tests should check that certain fields are not present in your API's responses. You might also want to think about a strict response schema for each endpoint, and tools such as GraphQL-shield make per-field rule building and composition trivial.
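One simple pattern for a strict response schema is an explicit allow-list: only named fields ever leave the server, so a new column added to the database can't leak by default. A minimal sketch (the field names are hypothetical, not a real schema):

```typescript
// Fields that are safe to expose on the connections endpoint.
const PUBLIC_USER_FIELDS = ["id", "displayName", "avatarUrl"] as const;

// Shape a raw user record down to its public representation.
// Anything not on the allow-list (email, card details, etc.)
// is silently dropped rather than relying on per-field removal.
function toPublicUser(user: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const field of PUBLIC_USER_FIELDS) {
    if (field in user) result[field] = user[field];
  }
  return result;
}
```

The inverse approach – deleting known-sensitive fields from a full record – is riskier, because every new sensitive field has to be remembered and stripped.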
Knowing when a user issue has occurred and being able to recall exactly what happened is vital, and thankfully trivial these days. Tools such as Sentry or Rollbar provide simple SDKs that do a lot out of the box and give you peace of mind as you grow. And they don't take long to set up.
Ask yourself the question, “If a disaster happened, how soon would I know about it, and what information would I be able to gather from logs or monitoring software to help me diagnose?” Those out-of-the-box tools can notify you of an incident via Slack, sending a link to the tool that includes a breakdown of the exception. You can also see the user who experienced the issue and the sequence of events that took place before it. If you can't currently do that, then set aside a morning to get it set up. Pronto.
A brute-force attack occurs when an attacker takes a known username and writes a piece of software that rapidly iterates through potential passwords for the given username.
Applications that allow users to have short, non-complex passwords are highly susceptible to brute-force attacks, and a lack of rate limiting increases the risk further. Rate limiting is a simple technique that prevents users from re-attempting an action more than a predetermined number of times within a set period. For example, if you limit the number of times a user can attempt to enter a password at login to 3 times every 5 minutes, an attacker trying to uncover a password by automated guessing couldn't get anywhere within a reasonable period of time. Shameless plug alert: we have a library for this when using GraphQL, called GraphQL-rate-limit.
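The core of a fixed-window rate limiter fits in a few lines. This is a minimal sketch assuming an in-memory store – a real deployment would typically back this with something like Redis so the limit holds across multiple server instances:

```typescript
// Fixed-window rate limiter: at most `max` attempts per `windowMs`
// per key (e.g. a username or IP address).
class RateLimiter {
  private attempts = new Map<string, { count: number; windowStart: number }>();

  constructor(private max: number, private windowMs: number) {}

  // Returns true if the action is allowed, false if the caller
  // has exhausted their attempts for the current window.
  attempt(key: string, now: number = Date.now()): boolean {
    const entry = this.attempts.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First attempt, or the previous window has expired: start fresh.
      this.attempts.set(key, { count: 1, windowStart: now });
      return true;
    }
    if (entry.count >= this.max) return false; // over the limit
    entry.count += 1;
    return true;
  }
}
```

The 3-attempts-per-5-minutes example above would be `new RateLimiter(3, 5 * 60 * 1000)`, checked on every login request before the password is even compared.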
Maintaining governance over access to the platforms you use is an easy ball to drop. Bad access governance will often look like manually shared access to a team's AWS account (as opposed to setting up a new user with the appropriate permissions). If this sounds familiar then a clean-up is highly recommended.
Additionally, if you have staging environments in your AWS account alongside the production environment, it may be worth holding them in entirely separate AWS accounts. This way you can prevent access to production outright for most of your developers, managing their access to staging where necessary.
Code is the key ingredient for any software product, so it shouldn't take much convincing to prioritise its quality. We've found that quality comes as a default with highly effective teams, and, for the most part, is achieved with tools that provide guard rails for consistency and best practice.
Most languages have a compatible tool called a linter that prevents developers from writing code that doesn't meet a set of rules such as limiting line length. This works to increase readability and, in some cases, prevent the complexity of a given function from exceeding a certain level, thus nudging the developer to break their code down into smaller chunks. You can also use this tool to prevent code from being released unless it passes all of the rules. Linters can also often auto-fix errors, so we really don’t see a reason not to use one.
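For a JavaScript or TypeScript codebase, both of the rules mentioned above map directly onto ESLint core rules. A minimal sketch of a config (the thresholds are illustrative, not a recommendation):

```javascript
// eslint.config.js – minimal flat config enforcing line length
// and function complexity limits.
module.exports = [
  {
    rules: {
      "max-len": ["error", { code: 100 }], // cap line length at 100 chars
      complexity: ["error", 10],           // cap cyclomatic complexity
    },
  },
];
```

Wired into CI, a config like this blocks a release until every rule passes, and `eslint --fix` handles the auto-fixable errors.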
Automated testing is a contentious topic in startups, mostly down to its impact on the pace of delivery and questions about the value it adds. As a team scales, the argument will likely dwindle and lean towards pro-testing as the team starts to realise that either a) they are releasing lots of bugs, or b) they are slowed down by a bottleneck in the manual QA process. Either way, most teams wish they'd started earlier when they look back at the amount of code they want to retrospectively write tests for. Try thinking about a "testing pyramid" when starting a discussion around automated tests, and consider which types of test will be most effective for you right now to start building the culture.
The effectiveness of your architecture will likely be based on the specifics of what you are building, and the decisions you make will need to take into account the context of the problem. However, there are several areas to be conscious of when making decisions on architecture.
Microservice architecture is a bit of a buzzword in the engineering world. There are plenty of reasons why it is a great idea – just Google the term and you'll see. What you probably won't find are the pitfalls. They're often hushed up by those who've joined the hype-train, which is why I'm only addressing the downsides.
First up: latency, an inevitable factor when making a request from one service to another. If, for example, a microservice is called as part of a user-facing API request, the microservice's latency simply adds to the overall API latency. That alone is quite often a good reason not to move functionality into a microservice.
In addition to latency, we have increased system complexity. Because you've introduced more than one service, you've got more complexity in running the infrastructure (locally for development or hosted) and monitoring requests. Don't get me wrong, there are definitely arguments for a microservice reducing complexity through decoupling pieces of a system, but you should also consider the developer experience.
Ultimately, you should do some research, consider the pros and the cons, and factor in some reflection with your team.
An investor or client will lose trust if your system is not reliable, and as your tech scales, availability has a direct impact on that reliability. Fortunately, building a highly available system is relatively trivial with most of the leading cloud providers. For example, if you are using AWS you can run your servers across multiple availability zones and use a load balancer to distribute the traffic.
You wouldn't go ahead and start building a skyscraper without the required safety procedures, cranes, diggers and site office. Similarly, you'd be unwise to go ahead and build a high-growth tech project without solid tools, processes and automation. And it doesn't need to be difficult!
In systems that depend on specific hardware and cloud configuration, making changes using Infrastructure as Code should be a high priority given the safety it provides. After all, as your team scales to multiple engineers with shared responsibility for creating and updating resources in your AWS account, it's important for everyone to know what changes were made, when they were made, and why. In addition, you need to be able to reverse changes if a service becomes unavailable, which is only possible if you have an accessible log of the changes made that you can step through in reverse.
Whilst there might be a steep learning curve with some of these tools, I can't emphasise enough the value for systems that need lots of services and frequent updates. Keeping your infrastructure as human-readable code will also allow your team to perform peer code reviews on the changes before they are automatically released, affording a sense check before the change makes its way into the wild.
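To show how little ceremony this involves, here's a minimal sketch of a resource definition in Terraform, one of the common Infrastructure as Code tools. The resource names and sizes are illustrative; the point is that the change lives in version control and goes through peer review like any other code:

```terraform
resource "aws_db_instance" "app_db" {
  identifier        = "app-db"
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20

  # Encryption at Rest becomes a reviewable one-line change
  storage_encrypted = true
}
```

Reverting a bad change is then just reverting the commit and re-applying, rather than remembering which console switches were flipped.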
Although disasters should be rare, the consequences can be damaging. Because of this, it’s a good idea to plan for some of the more likely scenarios, such as a server shutting down or your database becoming corrupt (or even disappearing completely). Planning for these cases will likely help to unearth some resolvable issues in your systems, such as making sure you have frequent hard database backups, working load balancers, or even point-in-time recovery of your database.
Automating your manual processes can go a long way. As well as reducing tedious repetitive tasks it can also help in:
- Security: you don't need to share keys and env files with developers, as they are all secured in your CI/CD tools.
- Productivity: you can run automated tests before automatically releasing work – the kind of task that may previously have required a full-time QA.
- Reliability: most bugs occur at the point of release, and manual releases leave more room for human error.
- Scalability: increasing automation reduces the manual burden, letting developers get on with iterating.
It's never too late to make any of the positive changes recommended here. The sooner you implement them, the less cumbersome it is for your team, systems, and process.
Of course, a balance is needed to remain lean and agile, whilst preparing for growth and staying efficient. But I can tell you from experience working with lots of pre- and post-Series A companies over the past years, that it’s rare to regret implementing any of the above.