Invoice System design

Questions ArchiveCategory: ProgrammingInvoice System design
Alcides asked 7 months ago

Hi Hassan,
I’m starting in web app development, and currently I have many doubts about the design of a multicompany invoice system.
First the crucial background:
– The system is for accounting consultants so the traffic is limited to their clients, I’ve estimated that there will be about 10000 invoice records per month from all of their clients
– I have to do the accountant automatization, so every record has to be divided in many accounts, something about three to ten for each record, so the estimated is 30000 to 100000 records of accounts per month.
– Starting from the account records I have to summarize them in many ways to get the results of a certain period, mostly monthly and yearly, with the goal of automate the accountant work
– The technologies to be used are the classic stack: Apache, PHP, MySQL with Laravel as Backend Framework and in the frontend no framework, just HTML, CSS, JS, jQuery and Ajax.
My main concerns are:
– Should I worry now about handling the quantity of records? I’ve been researching and most sources says that this amount should not be a problem, if there are problems they’ll appear when the records are in the order of many millions, Is this true?, how to handle it?
– Some tecnhiques I’ve found to manage huge data are 1) database partitions (MySQL partitioning) 2) based in design like divide the records in tables for each year: invoices_2018, invoices_2019, etc. 3) tombstone tables to clear the data of the most active tables.
What could be the best approach if applies to my case?
– The database engine I’m using is InnoDB, and I want to know: should I apply transactions for every non-atomic operation?, in my research some people says that with this database engine explicitly applying transactions, speed up the queries because of the default autocommit every line, Is this true?
– Last but not least, I’ve found that there is a debate in how to organize the data, some people say that the correct approach for multicompany data is to create a set of tables for each company (this makes sense for me if the information never should be shared or centralized, but it’s not my case), and some people says that it’s totally fine to manage multicompany data with a single set of tables, since I’m sticking to the last option, I’m right?
I’d like to know how a person with experience like you, would face some of these issues, could you give me some insights before I have problems?
If I forgot to consider something about this kind of problem I’d like to know too.
Thank you so much in advance.

Alcides,
PS: English is not my native language, so sorry for the language inconsistences.

1 Answers
Hassan Aboul Hassan answered 4 months ago

Hi Alcides. and thank you very much for sharing this.
And by the way, your English is perfect, it’s better than mine ?
 
1: Should I worry now about handling the quantity of records? I’ve been researching and most sources says that this amount should not be a problem, if there are problems they’ll appear when the records are in the order of many millions, Is this true?, how to handle it?
Yes, that is absolutely true, As you mentioned in your introduction, you talked about thousands of records and these can be handled easily with database systems like MySQL, SQL server, or Postgres.
If in your plan, you feel that you may reach millions of records if your app is scaled up. then you have to handle this from now in your architecture.
And this takes us to your second concern:
2:Some tecnhiques I’ve found to manage huge data are 1) database partitions (MySQL partitioning) 2) based in design like divide the records in tables for each year: invoices_2018, invoices_2019, etc. 3) tombstone tables to clear the data of the most active tables.
Yes, based on the design in your case is a very popular solution, especially for accounting software.
Since in this type of applications, you have yearly records and usually, older records are kept as an archive or for reporting.
So for each year, you can create a database that will be closed at the end of the year and a new database will be created for the next year and so on.
you can still access the old databases also, but now in this way, you will decrease the overhead and increase transactions performance in the newly created database since it will be empty. and the old one will be used on demand only.
Data partitioning is another solution and you can apply to but you have the overhead of managing its architecture since it has somehow a different implementation.
I think also you may have some performance overhead in partitioning if you have low bandwidth since the partitions will be distributed over multiple online servers. so take care if this also in your plan.
Concerning tombstone, sorry I didn’t understand what exactly is your approach, what do you mean delete most active tables?
I worked with huge accounting systems, and we managed millions of records in SQL server without even partitioning or dissecting into years.
But for your case as in my humble opinion go for splitting your databases (in design) over years if you feel you will have very large amounts of data later.
 
3:The database engine I’m using is InnoDB, and I want to know: should I apply transactions for every non-atomic operation?, in my research some people says that with this database engine explicitly applying transactions, speed up the queries because of the default autocommit every line, Is this true?
As I know about operations, If you are updating your data (a certain record) at the same time you must use atomic. non-atomic yes may be faster but you have no guarantee of data update in the case I mentioned. So take care of this in your plan.
because if use nonatomic in this situation, you will have something called data race as in c++. You will not know exactly what data is updated. this may be very specific. If you don’t have this situation then go for non-atomic.
 
4: Last but not least, I’ve found that there is a debate in how to organize the data, some people say that the correct approach for multicompany data is to create a set of tables for each company (this makes sense for me if the information never should be shared or centralized, but it’s not my case), and some people says that it’s totally fine to manage multicompany data with a single set of tables, since I’m sticking to the last option, I’m right?
Yes, you can simply manage multi-company with one set of data, I used to do so, especially in online systems. 
Just you have to take care of some security vulnerabilities since as you know if your tables where exposed then all companies data will be exposed also. So take care of this point if you are concerned about securing your data.
 
 
Finally, I don’t know how much you are experienced in programming, maybe you are better than me. But in my experience, I found that planning for a software is very very very important, A good programmer is that person who knows how to find the best solutions and not who writes code. you may know this fact, but I wanted to mention since I really faced real problems before and wasted a lot of time because I didn’t plan well.
I Programmed more than 50 applications and some are enterprise and huge, some are mobile, and others are web apps. In each one, I had read and researched about every little detail in the software before coding. sometimes I read books or watch hours of videos to make sure I take the right choice.
And between, is this project your own? or you work for a certain company?
Please, can you send me your CV or portfolio or any information about you? maybe we can work and help each other if you interested.
 
Thanks again for your question. I will be waiting for any future posts from you.