Calculate modularity of software architecture
In this article, we explore the concept of modularity in software architecture and its importance. We'll discuss how to calculate its key components - cohesion, coupling, and connascence - and their significance in assessing architectural quality.
What is modularity?
Modularity in software architecture refers to the practice of dividing a system into separate, independent modules that encapsulate different functionalities.
At its core, modularity involves organizing a system into cohesive units, each responsible for a specific aspect of the overall functionality. These modules are designed to be self-contained, with well-defined interfaces for interaction with other modules. For example, in a banking system, a module handling financial transactions could be separated from a module responsible for managing customer accounts.
Such a modular structure allows for flexible software design, testing, and development, enhancing scalability and maintainability. However, to reap the full benefits of modularity, an architect must also consider aspects such as cohesion, coupling, and connascence, which impact the efficiency and flexibility of the system. Thus, modularity becomes a crucial element in creating modern and sophisticated information systems capable of meeting dynamic market demands.
Dividing modules and establishing boundaries
Dividing a software system into modules and establishing clear boundaries between them is a crucial aspect of architectural design. Done well, it promotes maintainability, scalability, and flexibility. Here are some strategies for achieving this:
1. Single Responsibility Principle (SRP)
Follow the SRP, which states that a module should have only one reason to change. Identify cohesive functionalities within your system and group them together into modules. Each module should encapsulate a single, well-defined responsibility. For example, in a web application, you might have separate modules for user authentication, order processing, and product management.
2. Separation of Concerns (SoC)
Apply the SoC principle to divide your system into distinct concerns, such as data access, business logic, and user interface. Each concern should be addressed by a separate module, allowing for easier maintenance and scalability. For instance, in a web application, you could have separate modules for handling database interactions, business rules, and frontend presentation.
3. Domain-Driven Design (DDD)
Utilize DDD to identify the core domains of your application and model them as separate modules. Define bounded contexts for each domain, establishing clear boundaries that prevent domain logic from leaking into other parts of the system. For instance, in an e-commerce application, you might have separate modules for managing products, orders, and customers.
4. Interface Segregation Principle (ISP)
Design module interfaces with the ISP in mind, which states that clients should not be forced to depend on interfaces they do not use. Define minimal and focused interfaces that cater to the specific needs of each module, avoiding unnecessary dependencies. This allows for better encapsulation and reduces the impact of changes on other modules.
5. Dependency Inversion Principle (DIP)
Apply the DIP by decoupling modules from their concrete implementations and depending on abstractions. Use dependency injection to inject dependencies into modules, allowing for greater flexibility and testability. This promotes loose coupling between modules and makes it easier to replace implementations without affecting the rest of the system.
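As a rough illustration of the last two principles, the sketch below shows a high-level module that depends on a small, focused abstraction and receives its implementation through the constructor. All names (`PaymentGateway`, `OrderService`, `FakeGateway`) are hypothetical and chosen only for this example.

```python
from typing import Protocol


class PaymentGateway(Protocol):
    """Narrow abstraction the order module depends on (hypothetical name)."""

    def charge(self, amount_cents: int) -> bool: ...


class OrderService:
    """High-level module: it knows only the PaymentGateway abstraction."""

    def __init__(self, gateway: PaymentGateway) -> None:
        # The dependency is injected instead of being constructed here.
        self._gateway = gateway

    def place_order(self, amount_cents: int) -> str:
        return "confirmed" if self._gateway.charge(amount_cents) else "rejected"


class FakeGateway:
    """Test double; a real implementation would call a payment provider."""

    def __init__(self, succeed: bool) -> None:
        self.succeed = succeed
        self.charged = []

    def charge(self, amount_cents: int) -> bool:
        self.charged.append(amount_cents)
        return self.succeed


gateway = FakeGateway(succeed=True)
print(OrderService(gateway).place_order(1999))
```

Because `OrderService` never names a concrete gateway, swapping the fake for a real implementation should not require touching the order module.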
When to divide modules?
Consider dividing a module when it becomes too large or when it violates the single responsibility principle (or another of the rules above, depending on which ones you follow). However, avoid splitting a module whose parts are tightly coupled to one another, as this can lead to increased complexity and reduced maintainability.
Trying to split a coherent module would only increase coupling and decrease readability. ~Larry Constantine
Components of modularity
Modularity includes cohesion, coupling, and connascence.
Together, these three factors contribute to the modularity of an architecture, affecting its flexibility, maintainability and scalability. Understanding and quantifying these factors is essential for architects to effectively manage and control the current state of the architecture.
Let's take a look at each of them!
Cohesion
Cohesion measures the degree to which elements within a module belong together. High cohesion implies that elements within a module are closely related and contribute to a single, well-defined purpose.
Let's explore some types of cohesion, starting from the highest level.
Functional Cohesion
Methods in a module perform a common, logical task or operate on a common set of data. This is the highest level of cohesion, where all methods in the module collaborate to achieve one well-defined task.
Sequential Cohesion
Methods in a module are called in a specific sequence, where the output of one method is used as input for the next method. This is sequential cohesion, where methods are linked by the sequence of their invocations.
Communicational Cohesion
Methods in a module operate on the same data or contribute to the same output. Communicational cohesion occurs when methods in a module are linked by the data they share.
Procedural Cohesion
Methods in a module are grouped because they must run in a particular order as steps of one procedure, even though they do not necessarily pass data to one another. For example, a module that first checks a file's permissions and then opens the file.
Temporal Cohesion
Methods in a module are invoked at the same point in time, but not necessarily to perform related tasks. Temporal cohesion is when methods are linked by the time of their invocation, but not necessarily by their functionality.
Logical Cohesion
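In this case, methods in a module are grouped because they fall into the same logical category, not because they work together functionally. A typical example is a utility module that collects unrelated string helpers. Because these methods vary widely in their specific operations, such modules can hurt code comprehensibility.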
Coincidental Cohesion
This is the lowest level of cohesion, where methods in a module perform different, unrelated tasks. It results from a random collection of methods in a module that lack logical connection. Modules with coincidental cohesion are difficult to understand and maintain.
Cohesion metrics
LCOM version 1
Lack of Cohesion in Methods (LCOM) measures the lack of cohesion within a class. LCOM version 1 is the simplest formula for calculating cohesion. It is useful for quick and basic calculations in uncomplicated cases.
It is calculated as the difference between the number of method pairs that do not share any instance variables and the number of method pairs that do share instance variables, with negative results counted as zero. Mathematically, LCOM 1 can be expressed as:
$$\mathrm{LCOM1} = \begin{cases} |P| - |Q|, & \text{if } |P| > |Q| \\ 0, & \text{otherwise} \end{cases}$$
Where:
- |P| represents the number of method pairs that do not share any instance variables.
- |Q| represents the number of method pairs that share at least one instance variable.
Let's consider a class with 3 methods, which gives 3 method pairs. Among them, 2 pairs of methods do not share any instance variables (|P| = 2), and 1 pair of methods shares at least one instance variable (|Q| = 1).
Substituting these values into the formula, |P| = 2 is greater than |Q| = 1, so:
$$\mathrm{LCOM1} = |P| - |Q| = 2 - 1 = 1$$
In this case, LCOM equals 1. This value indicates some lack of cohesion in the class, as there are more pairs of methods not sharing any instance variables than pairs of methods sharing at least one.
Interpretation of LCOM1 values:
- High Cohesion: If the LCOM result is zero, it means that the pairs of methods sharing at least one field are at least as numerous as the pairs that do not share any field. An LCOM 1 value of zero suggests high cohesion within the class: methods often operate on the same data and are strongly related.
Figure: Perfect cohesion.
Figure: High cohesion, but some parts may operate on different sets of data.
- Low Cohesion: If the LCOM result is positive, it means that there are more pairs of methods sharing no field than pairs sharing at least one. The higher the LCOM 1 value, the lower the cohesion within the class, indicating that methods are not strongly related and may operate on different sets of data.
Figure: Low cohesion.
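The pair counting described above can be sketched in a few lines of Python. This is a minimal illustration, assuming each method is described by the set of instance variables it touches; the `lcom1` function and the `cart` class layout are hypothetical.

```python
from itertools import combinations


def lcom1(method_fields):
    """LCOM1: non-sharing method pairs minus sharing pairs, floored at zero.

    `method_fields` maps each method name to the set of instance
    variables that the method reads or writes.
    """
    non_sharing = 0
    sharing = 0
    for (_, a), (_, b) in combinations(method_fields.items(), 2):
        if a & b:
            sharing += 1
        else:
            non_sharing += 1
    return max(non_sharing - sharing, 0)


# Hypothetical class with three methods, hence three method pairs.
cart = {
    "add_item": {"items"},
    "total": {"items", "tax_rate"},
    "print_banner": {"banner"},
}
print(lcom1(cart))  # 2 non-sharing pairs - 1 sharing pair = 1
```

Only `add_item` and `total` share a variable here, so two of the three pairs are disjoint and the difference is 1.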
LCOM version 4
LCOM 4 is preferred over LCOM 1 because it considers method dependencies, providing a more nuanced understanding of method cohesion, especially in complex class structures. By accounting for all dependencies between methods, LCOM 4 offers a more accurate assessment of method cohesion, reflecting real-world software design practices more effectively.
LCOM 4 is calculated from the method dependency graph of the class, in which each method is a vertex. The calculation uses the following quantities:
- the number of methods in the class,
- the degree of each vertex in the method dependency graph (i.e., the number of methods that the i-th method collaborates with),
- the number of fields used only by the i-th method.
In other words, the calculation combines the vertex degrees in the method dependency graph to arrive at LCOM 4 for the class.
Example
Take a look at our example: a class with three methods (method1, method2, method3) and four fields (field1 to field4). For each method, we get:
- for method1: it references field1, field3, and field4, and the pair field3 and field4 is shared within method1.
- for method2: it references field2 and field3, and it does not share any pairs of fields.
- for method3: it references field3 and field4, and the pair field3 and field4 is shared within method3.
With these values, we can go to the calculations. The closer the resulting value is to zero, the better organized the code is.
Interpreting the results of LCOM 4
High Cohesion: If the result is close to zero or low, it indicates that the methods in the class are strongly related and collaborate with each other. This corresponds to a situation where most methods in the class operate on the same data fields and collaborate with each other.
Low Cohesion: If the result is high, it suggests that the methods in the class are weakly related and operate in isolation. This may suggest that methods in the class operate on different sets of data and do not collaborate with each other, leading to difficulties in understanding and maintaining the code.
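A widely used formulation of LCOM4, attributed to Hitz and Montazeri, counts the connected components of a graph whose vertices are methods and whose edges join methods that touch a common field or call one another. The sketch below follows that formulation, which is an assumption and may not match the formula used in this article: under it, 1 is the best score for a non-empty class, and a value of 2 or more suggests the class could be split. The field usage is taken from the example above; the `lcom4` function and its `calls` parameter are hypothetical.

```python
def lcom4(method_fields, calls=None):
    """Count connected components of the method graph.

    Two methods are connected when they touch a common field or when one
    calls the other. `calls` maps a method to the methods it invokes.
    """
    calls = calls or {}
    methods = list(method_fields)
    neighbours = {m: set() for m in methods}
    for i, a in enumerate(methods):
        for b in methods[i + 1:]:
            linked = (
                method_fields[a] & method_fields[b]
                or b in calls.get(a, ())
                or a in calls.get(b, ())
            )
            if linked:
                neighbours[a].add(b)
                neighbours[b].add(a)

    seen = set()
    components = 0
    for start in methods:
        if start in seen:
            continue
        components += 1  # a new, not yet visited group of methods
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return components


example = {
    "method1": {"field1", "field3", "field4"},
    "method2": {"field2", "field3"},
    "method3": {"field3", "field4"},
}
print(lcom4(example))  # all three methods touch field3, so one component
```

Since every method in the example touches field3, the graph is fully connected and this variant reports a single component.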
Coupling
Coupling refers to the degree of interdependence between modules. High coupling means that a change in one module will likely necessitate changes in another, while low coupling means that modules are relatively independent and can change without affecting one another.
Afferent Coupling (Ca)
Afferent coupling refers to the number of incoming dependencies to a module or component. It indicates how many other modules or components rely on a particular module. High afferent coupling means that many other modules depend on a given module, suggesting that the module is highly reused or serves as a core part of the system.
Key Points:
- High Afferent Coupling: Indicates that a module is heavily used by other parts of the system. It often implies that the module is stable and well-tested, as changes to it can have widespread impact.
- Implication: Modules with high afferent coupling should be stable and robust because changes to these modules can affect many other parts of the system.
Efferent Coupling (Ce)
Efferent coupling refers to the number of outgoing dependencies from a module or component. It measures how many other modules a given module depends on. High efferent coupling means that a module relies on many other modules, suggesting that it may be more complex and harder to understand in isolation.
Key Points:
- High Efferent Coupling: Indicates that a module depends on many other modules. This can make the module more complex and tightly coupled to the rest of the system.
- Implication: Modules with high efferent coupling should be designed carefully to manage dependencies and reduce complexity.
Practical Example
Consider a class A that provides utility functions used by many other classes (high afferent coupling), and another class B that needs to use several external libraries to perform its operations (high efferent coupling).
- Class A (High Afferent Coupling): This class is central to the system and changing it can have widespread effects. It should be well-tested and stable.
- Class B (High Efferent Coupling): This class has many dependencies, making it more complex and potentially more fragile. Refactoring may be needed to reduce dependencies and simplify the class.
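Both numbers can be read off a dependency map. The sketch below is a minimal illustration, assuming dependencies are available as a mapping from each module to the modules it uses; the `coupling` function and the module names other than A and B are hypothetical.

```python
def coupling(dependencies):
    """Return {module: (Ca, Ce)} for a module -> dependencies mapping."""
    modules = set(dependencies)
    for targets in dependencies.values():
        modules.update(targets)

    # Ce: distinct modules this module depends on (self-references ignored).
    efferent = {m: len(set(dependencies.get(m, ())) - {m}) for m in modules}

    # Ca: distinct modules that depend on this module.
    afferent = {m: 0 for m in modules}
    for source, targets in dependencies.items():
        for target in set(targets) - {source}:
            afferent[target] += 1

    return {m: (afferent[m], efferent[m]) for m in sorted(modules)}


# Class A is a shared utility; class B leans on several libraries.
deps = {
    "A": [],
    "B": ["A", "http_lib", "json_lib", "db_lib"],
    "Reports": ["A"],
    "Billing": ["A"],
}
for module, (ca, ce) in coupling(deps).items():
    print(f"{module}: Ca={ca}, Ce={ce}")
```

In this made-up system A ends up with Ca = 3 and Ce = 0, while B has Ca = 0 and Ce = 4, matching the two roles described above.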
Coupling metrics
In the JavaScript world, there are several tools available that can measure coupling or at least analyze dependencies, including:
- Webpack Bundle Analyzer
- Nx Dependency Graph
- madge
But! We can also calculate the coupling manually, for instance, by using the CBO.
Coupling Between Objects (CBO)
Coupling Between Objects (CBO) is a metric used to measure the degree of interdependence between classes in an object-oriented system. It helps in understanding how changes in one class might affect other classes. High coupling indicates that classes are highly dependent on each other, which can reduce modularity and make the system harder to maintain.
CBO is primarily associated with object-oriented programming (OOP), but the concept of coupling is relevant to other programming paradigms as well.
In non-OOP paradigms, coupling still refers to the degree of interdependence between software modules. For instance, in procedural programming, functions or modules may have dependencies on each other, affecting modularity and maintainability.
$$\mathrm{CBO} = N$$
Where:
- N is the number of classes a given class is coupled with.
Example and calculation
Consider a simplified e-commerce system with the following classes and relationships:
- Order
- Payment
- Inventory
- Customer
Example relationships:
- Order class uses Payment and Inventory classes.
- Payment class uses Customer class.
- Customer class uses Order class.
CBO Calculation:
- Order class uses: Payment, Inventory
- Payment class uses: Customer
- Inventory class uses: None
- Customer class uses: Order
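The counts for this example can be computed with a short script. The sketch below shows two conventions, both of which are assumptions about how to read N: counting only the classes a class uses, as the lists above do, and the bidirectional reading in which a class is also coupled to the classes that use it. The function names are hypothetical.

```python
def cbo_outgoing(uses):
    """N = number of distinct classes each class uses."""
    return {cls: len(set(targets) - {cls}) for cls, targets in uses.items()}


def cbo_bidirectional(uses):
    """N = number of distinct classes a class uses or is used by."""
    coupled = {cls: set() for cls in uses}
    for cls, targets in uses.items():
        for target in targets:
            if target == cls:
                continue  # a class is not coupled to itself
            coupled[cls].add(target)
            coupled.setdefault(target, set()).add(cls)
    return {cls: len(others) for cls, others in coupled.items()}


uses = {
    "Order": ["Payment", "Inventory"],
    "Payment": ["Customer"],
    "Inventory": [],
    "Customer": ["Order"],
}
print(cbo_outgoing(uses))
print(cbo_bidirectional(uses))
```

With outgoing dependencies only, Order scores 2 and Inventory 0. With the bidirectional reading, Order rises to 3 because Customer also depends on it.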
Abstractness
Abstractness is a metric used to quantify the proportion of abstract elements in a component or package. It provides insight into how flexible and extensible a module is.
Calculating Abstractness
Abstractness (A) is calculated using the formula:
$$A = \frac{N_a}{N_t}$$
Where:
- $N_a$ is the number of abstract types (abstract classes and interfaces) in the component.
- $N_t$ is the total number of types (including classes, abstract classes, and interfaces) in the component.
The abstractness value ranges from 0 to 1:
- $A = 0$: The component is entirely concrete (no abstract types).
- $A = 1$: The component is entirely abstract (only abstract types).
Practical Example
Consider a package with the following types:
- 4 concrete classes
- 3 abstract classes
- 2 interfaces
First, identify the number of abstract types $N_a$:
- Abstract classes: 3
- Interfaces: 2
- Total abstract types: $N_a = 3 + 2 = 5$
Next, determine the total number of types $N_t$:
- Concrete classes: 4
- Abstract classes: 3
- Interfaces: 2
- Total types: $N_t = 4 + 3 + 2 = 9$
Now, calculate the abstractness $A$:
$$A = \frac{N_a}{N_t} = \frac{5}{9} \approx 0.56$$
This result indicates that approximately 56% of the types in this component are abstract.
Remember: code with too little abstraction is difficult to maintain, and code with too much abstraction is fragile and easy to break.
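The same ratio can be measured on real classes. The sketch below is one possible approach in Python, assuming that "abstract" means a class that still has at least one unimplemented abstract method, which is what the standard library's `inspect.isabstract` reports. The `abstractness` function and the shape classes are hypothetical.

```python
from abc import ABC, abstractmethod
import inspect


def abstractness(types):
    """Share of abstract types among `types`; 0.0 for an empty component."""
    if not types:
        return 0.0
    abstract = sum(1 for t in types if inspect.isabstract(t))
    return abstract / len(types)


class Shape(ABC):  # abstract: declares an abstract method
    @abstractmethod
    def area(self):
        ...


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius ** 2


# One abstract type out of three.
print(abstractness([Shape, Square, Circle]))
```

For the package in the example above, the same division gives 5 / 9, which rounds to 0.56.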
Connascence
Connascence is a concept introduced by Meilir Page-Jones to describe how closely software components depend on one another: two components are connascent if a change in one requires a change in the other to keep the system correct. It focuses on the kind and degree of that interdependence rather than on the structure or type of dependencies.
Types of connascence
Connascence comes in different forms. The forms below are listed from the weakest (easiest to refactor) to the strongest (hardest to refactor):
- Connascence of name: Two components must agree on a name, such as an identifier or variable name.
- Connascence of meaning: Two components must agree on the meaning of particular values, such as a shared constant or magic number.
- Connascence of position: Two components rely on elements being in a specific position or order, such as the order of arguments.
- Connascence of algorithm: Two components rely on using the same algorithm or approach.
- Connascence of timing: Two components rely on elements being executed or processed at a specific time or in a specific sequence.
- Connascence of value: Two components rely on values that must change together.
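Two of these forms are easy to show in code. The sketch below is an illustration with hypothetical names: a magic number that caller and callee must both interpret the same way is replaced by a named enum member, and positional arguments are replaced by keyword-only ones. In both cases the dependency turns into connascence of name.

```python
from enum import Enum


class Role(Enum):
    USER = 1
    ADMIN = 2


def can_delete_by_code(role_code):
    # Connascence of meaning: every caller must know that 2 means "admin".
    return role_code == 2


def can_delete(role):
    # Connascence of name: callers only need the name Role.ADMIN.
    return role is Role.ADMIN


def make_user_positional(name, email, is_admin):
    # Connascence of position: callers must pass arguments in this order.
    return {"name": name, "email": email, "is_admin": is_admin}


def make_user(*, name, email, is_admin=False):
    # Keyword-only parameters turn this into connascence of name.
    return {"name": name, "email": email, "is_admin": is_admin}


print(can_delete(Role.ADMIN))
print(make_user(email="ann@example.com", name="Ann"))
```

Swapping `name` and `email` in a call to `make_user_positional` would silently produce a wrong record, whereas `make_user` accepts the keywords in any order.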
Difference between connascence and coupling
While both connascence and coupling relate to dependencies between software components, they focus on different aspects:
Coupling measures the degree of interdependence based on the structure, data exchange, control flow, or timing between modules. Connascence describes the degree of correlation or interdependence between elements within or between modules based on their meaning, name, position, timing, algorithm, or value.
Connascence is a more specific concept compared to Coupling.
In this article, we discussed the key elements of software architecture modularity: cohesion, coupling, and connascence. Understanding these concepts and their calculation methods is crucial for architects to manage architectural quality effectively.