Early Detection of Lung Metastases in Breast Cancer Using YOLOv10 and Transfer Learning: A Diagnostic Accuracy Study

Hakan Gokalp Taş; Mehmet Bilge Han Taş; Eyyup Yildiz; Sonay Aydin

doi:10.12659/MSM.948195

09 September 2025: Database Analysis

Hakan Gokalp Taş^{ABCE 1*}, Mehmet Bilge Han Taş^{CEF 2}, Eyyup Yildiz^{CEF 2}, Sonay Aydin^{AE 3}

Med Sci Monit 2025; 31:e948195

Abstract

BACKGROUND: This study used CT imaging analyzed with deep learning techniques to assess the diagnostic accuracy of lung metastasis detection in patients with breast cancer. The aim of the research was to create and verify a system for detecting malignant and metastatic lung lesions that uses YOLOv10 and transfer learning.

MATERIAL AND METHODS: From January 2023 to 2024, CT scans of 16 patients with breast cancer who had confirmed lung metastases were gathered retrospectively from Erzincan Mengücek Gazi Training and Research Hospital. The YOLOv10 deep learning system was used to assess a labeled dataset of 1264 enhanced CT images.

RESULTS: A total of 1264 labeled images from 16 patients were included. With an accuracy of 96.4%, sensitivity of 94.1%, specificity of 97.1%, and precision of 90.3%, the ResNet-50 model performed best. The robustness of the model was shown by the remarkable area under the curve (AUC), which came in at 0.96. After dataset tuning, the GoogLeNet model’s accuracy was 97.3%. These results highlight our approach’s improved diagnostic capabilities over current approaches.

CONCLUSIONS: This study shows how YOLOv10 and transfer learning can be used to improve the diagnostic precision of pulmonary metastases in patients with breast cancer. The model’s effectiveness is demonstrated by the excellent performance metrics attained, opening the door for its application in clinical situations. The suggested approach supports prompt and efficient treatment decisions by lowering radiologists’ workload and improving the early diagnosis of metastatic lesions.

Keywords: Breast Neoplasms, Disease Progression, lung neoplasms, machine learning, Tomography, X-Ray Computed, Humans, Female, Middle Aged, Retrospective Studies, Early Detection of Cancer, Deep Learning, Sensitivity and Specificity, adult, Aged

Introduction

MOTIVATION AND BACKGROUND:

Breast cancer is one of the leading causes of cancer-related death worldwide, and prompt diagnosis and early detection are crucial. Staging the disease, choosing the best course of treatment, and improving patient survival all depend on early identification of the cancer and its metastases. However, because of the complex structure of medical imaging data and the subtle features of metastatic tumors, conventional approaches are frequently inadequate for accurately identifying malignant and metastatic regions. To address these issues, our study employed deep learning algorithms. CT images from 16 patients were labeled to identify malignant and metastatic regions, yielding a total of 1264 labeled CT images. The YOLOv10 algorithm was used for detection, and detected images whose confidence score exceeded a set threshold were added back into the dataset. This approach enabled high-performance results and more accurate detection of metastases. The final results were obtained by evaluating this dataset with transfer learning. By developing a more efficient technique for the early detection of breast cancer metastases, this research seeks to improve clinical practice as well as patient outcomes.

Material and Methods

PRE-PROCESSING:

Image pre-processing for the YOLOv10 model was designed to help identify metastatic lung lesions. The CT images in the dataset were first converted to grayscale and binarized with a threshold value of 100. All contours were then detected, and the largest contour was selected to highlight the areas where lung metastases were located. This automatic segmentation approach filtered out unnecessary regions so that YOLO could detect metastatic regions more precisely. The mask created from the largest contour was applied to the original image, and regions outside the area of interest were removed. The model was thus focused only on the relevant regions, with the aim of achieving higher accuracy with limited data. This masking method helped localize metastatic regions in radiological images and improved the learning process of the model. It also prevented YOLO from learning unnecessary regions, allowing high performance with less data. These details are reported to increase the reproducibility of the study. All pre-processing steps were implemented in Python using the OpenCV and scikit-image libraries. For masking, the largest lung contour was detected using OpenCV. Lung areas were segmented by thresholding with a fixed threshold and then refined by morphological operations such as erosion and dilation with kernel sizes of 3×3 and 5×5. Even with a restricted dataset, these steps ensured consistent pre-processing and enabled model training.

Using image processing techniques, unwanted portions of the image were eliminated. This improved the training results, because deep learning models can otherwise learn from unnecessary or undesired elements. To avoid this, several steps were followed. Figure 2 shows the pre-processing that was carried out. The image was first masked to remove extraneous information outside the body, and then segmented so that only the lung tissue regions remained visible. Differences in threshold values were used to make adjustments, which made it possible to separate the contrast levels in the original image. This created a clear difference between the original and the thresholded image without distorting the image. The lung regions were then separated by matching regions of similar tone in the masked image with the near-black regions in the original image. Figure 2A is the original image, Figure 2B is the masked image, Figure 2C is the masked and thresholded image created for segmentation, and Figure 2D is the segmented version.

DEEP LEARNING AND YOLOV10:

YOLO is a deep learning technique that uses convolutional neural networks (CNN) to detect objects in an image and differentiate between them [29]. Deep learning is used to locate the pixel regions that require detection and to extract the region of interest. The model then draws a bounding box around this region and predicts its class. Because it trains and runs quickly, it is frequently used in real-time applications. Labeled data can be used for both detection and segmentation with the YOLO architecture. Pre-trained weight files speed up the process and help achieve better region detection and high mAP results. Furthermore, adding the images obtained from the YOLO algorithm’s detections to the dataset produces a larger dataset and supports a higher-performance study.

One-to-one label assignment avoids Non-Maximum Suppression (NMS) post-processing by assigning only 1 prediction to each true label, as opposed to the “one-to-many” approach, which allows several predictions to be assigned to a true label. However, because of weaker supervision, one-to-one assignment can result in lower accuracy and a slower rate of learning [30]. The “one-to-many” assignment technique can make up for this shortfall [31]. To accomplish this, YOLO models were enhanced with dual label assignments, which integrate the most advantageous features of both approaches. In particular, YOLO was enhanced with an additional one-to-one head, as illustrated in Figure 3. While keeping the same structure and optimization objectives as the original one-to-many branch, this head assigns labels using the one-to-one matching technique. The 2 heads are jointly optimized with the model during training so that the neck and backbone benefit from the rich supervision that the one-to-many assignment provides. Only the one-to-one head is used during the prediction phase; the one-to-many head is turned off. As a result, YOLO models can be deployed end-to-end with no extra prediction cost. The consistent matching metric quantifies the degree of agreement between predictions and instances in both one-to-many and one-to-one assignments. A common matching metric is used to provide prediction-driven matching for both branches.

In this metric:
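For reference, the consistent matching metric is given in the YOLOv10 paper in the following form. The symbols below follow that paper rather than this text, so they should be read as an assumption about what was intended here: p is the classification score, \hat{b} and b are the predicted and ground-truth bounding boxes, s is a spatial prior indicating whether the anchor point of the prediction lies within the instance, and α and β are hyperparameters that balance the classification and localization terms.

```latex
\[
m(\alpha, \beta) = s \cdot p^{\alpha} \cdot \mathrm{IoU}(\hat{b}, b)^{\beta}
\]
```

Both branches use this same form, with their own (α, β) values for the one-to-many and one-to-one heads.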

A classification procedure based on artificial intelligence was used in this study. This can be accomplished in a variety of ways. Numerous techniques, including deep learning, its sub-branch convolutional neural networks (CNN), and transfer learning, have been employed in recent years. These techniques are widely used because they produce excellent results, particularly when applied to images. This study used transfer learning, which is another CNN-based training technique. It differs from a standard CNN in that it starts from a pre-trained model, and it is favored because of its excellent performance and shorter training times. Figure 4 shows a classic CNN architecture.

In CNN models, an input layer receives the data. The features extracted by the convolutional layers are then passed to pooling layers. The objectives of pooling are to reduce the amount of data while retaining the most informative features and to obtain results more quickly. Following these steps, the feature maps are flattened, classification is carried out, and the output layer presents the classifier’s result. The convolution layer moves a window of a specific size across the input data and computes the result of multiplying it by the filter (kernel). This procedure emphasizes specific features and lowers the dimensionality of the incoming data. The activation layer applies a non-linear operation by passing the convolution layer’s output through a specific activation function (such as ReLU). The pooling layer further decreases the dimensionality of the data by taking the average or maximum of the features inside a given window. Fully connected layers convert the entire feature map into a single vector and pass the result to the following layer. Together, these layers make up the mathematical formulation of a CNN. A SoftMax activation function is typically used to produce the CNN’s output as a classification result. Convolution, for instance, computes the result (Y) of multiplying the input data (X) by a window (W) of a specific size. The mathematical formulation of a CNN can be rather complex; however, the fundamental operations are typically represented as follows [32]:
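Based on the description of Equation 2 in the text, the convolution operation can be written as below. The zero-based summation limits are an assumption; the symbols X, W, Y, i, j, m, n, and f are the ones defined in the text.

```latex
\[
Y(i, j) = \sum_{m=0}^{f-1} \sum_{n=0}^{f-1} X(i + m,\, j + n)\, W(m, n)
\]
```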

Equation 2 describes a convolution operation, which is a fundamental technique used in image processing and deep learning. In this operation, a filter (W) is slid over an input matrix (X) to generate an output matrix (Y). The input matrix (X) represents the data to be processed, such as an image, while the filter (W) is typically a smaller matrix containing weights designed to extract specific features from the input. For each position in the input matrix, an f×f window of the data is selected, and the elements of this window are multiplied element-wise with the corresponding elements of the filter. The results of these multiplications are summed, and this total is stored as the value of the output matrix at the current position. In this formula, i and j denote the coordinates of the top-left corner of the region currently being processed in the input matrix, while m and n represent the row and column indices of the filter. The size of the filter is given as f, indicating both its height and width. By sliding the filter across the entire input matrix, the operation captures meaningful patterns or features, such as edges or textures, which are then represented in the output matrix. This technique is commonly used in image processing tasks, such as edge detection, and serves as a core component in convolutional neural networks (CNNs), where it plays a critical role in feature extraction and pattern recognition.
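The sliding-window operation described above can be illustrated with a short NumPy sketch. The function name `conv2d_valid` and the example matrices are hypothetical, and the sketch assumes a stride of 1 with no padding, which the text does not specify.

```python
import numpy as np


def conv2d_valid(X, W):
    """Slide an f x f filter W over X (stride 1, no padding) and sum the products."""
    f = W.shape[0]
    out_h = X.shape[0] - f + 1
    out_w = X.shape[1] - f + 1
    Y = np.zeros((out_h, out_w))
    for i in range(out_h):          # i, j: top-left corner of the current window
        for j in range(out_w):
            window = X[i:i + f, j:j + f]
            Y[i, j] = np.sum(window * W)  # element-wise product, then sum
    return Y


# Illustrative input and a 2x2 filter that subtracts the lower-right
# neighbour from each pixel.
X = np.arange(16, dtype=float).reshape(4, 4)
W = np.array([[1.0, 0.0],
              [0.0, -1.0]])
Y = conv2d_valid(X, W)
```

With this input, every output entry equals X(i, j) minus X(i+1, j+1), which is -5 throughout, and the output is 3×3 because a 2×2 filter fits into a 4×4 input at three positions per axis.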

TRANSFER LEARNING:

In recent years, transfer learning has grown in importance in machine learning research. This method involves using a model that has already been trained for a comparable task. When dealing with issues involving small datasets or inadequate computing power, this method can be quite helpful. There are various approaches to transfer learning, including combining the outputs of a previously trained model or building a new model using a subset of the previously trained model. Research has indicated that transfer learning is especially successful in the domains of natural language processing, object identification, and image classification. Transfer learning was found to perform better on several tasks in a multi-criteria study. For this reason, transfer learning is considered an important technique in machine learning research [33–36].

We describe a model designed to automatically differentiate between images taken for the detection of cancer and metastases. The model’s transfer learning approach serves as the foundation for the framework. A model that has been trained on a source task and source domain applies its knowledge to a separate but related target task and target domain in a process known as transfer learning. Transfer learning is widely used to improve performance and expedite the training of deep learning models. Three fundamental elements can be used to illustrate the transfer learning process in deep learning: Knowledge transfer, source model, and destination model. To learn generic features, a deep learning model is often trained using a large dataset. While the final layers of these models are task-specific, the initial layers typically learn relatively broad properties. The knowledge acquired by the source model is applied to the target task after training. The target model includes the early layers of the source model, which are fixed (frozen). In addition to the broad features from the source model, the target model learns specific information needed for the target task. Deep learning models become more versatile through transfer learning, which enables them to swiftly and effectively adjust to a variety of tasks. Here, a pre-trained deep neural network and a bespoke neural network are combined to form the framework model. Initially, the pre-trained neural network receives the data as input. The neural network component discovers the data’s low- and mid-level patterns. The custom neural network, which is the second part of the framework model, receives these learned patterns as input. In this case, the goal of the custom neural network is to identify the unique, problem-specific, high-level patterns in the data and classify the data as output.

The resulting framework model is shown in Figure 5. As described above, it is built on the transfer learning method and combines a pre-trained deep neural network, which provides the low- and mid-level patterns of the data, with a custom neural network that learns the problem-specific high-level patterns and assigns the data to the correct class.

Commonly used convolutional networks work successfully on many different types of data. In this study, the performance of several convolutional networks on the dataset was investigated. The ResNet18, ResNet50, DenseNet121, DenseNet161, EfficientNet-B0, EfficientNet-B1, and GoogLeNet network structures were used in the proposed framework model. The last layers (classification layers) of these pre-trained network models, which learn problem-specific high-level features, were not used. In their place, a custom neural network was added to the framework model as the second component. The custom neural network was constructed with two linear layers. Thus, the framework model, consisting of 2 sub-components, learns high-level features faster by taking advantage of the patterns already learned by well-known convolutional networks. In addition, because the proposed model is structured as a framework, other pre-trained architectures can be added to it through transfer learning.

DENSENET:

DenseNet, or densely connected convolutional networks, is an architecture designed to maximize parameter efficiency and boost the learning capability of deep neural networks. The model’s primary innovation is its use of dense connections that link each layer’s output to all subsequent layers. By preventing information loss and facilitating gradient propagation throughout the network, these connections lessen the “vanishing gradient” problem that arises particularly in deep networks. DenseNet facilitates improved information flow by providing direct access to both high-level and low-level features at every layer. The number of new feature maps that each layer creates is controlled by a parameter known as the growth rate, which boosts learning capacity while lowering computing costs. DenseNet is made up of densely connected sections called “dense blocks” and the transition layers that link them. By making the feature maps smaller, the transition layers keep the model lightweight and minimize unnecessary computation. This structure supports effective operation of the network while also reducing computational cost. DenseNet has gained popularity in fields such as medical image analysis and has demonstrated strong performance in tasks such as image classification and segmentation. This architecture is well suited to both academic and industrial applications because it offers high accuracy with relatively few parameters [25].

EFFICIENTNET:

EfficientNet is a neural network architecture created to maximize computing efficiency while enhancing the performance of deep learning models. In contrast to conventional techniques, this model addresses the model scaling problem using a process known as compound scaling. By concurrently adjusting the network’s width, depth, and input resolution, compound scaling offers balanced scaling. As a result, EfficientNet operates more efficiently than previous large-scale networks and offers high accuracy with fewer parameters. MBConv blocks, similar to those used in the MobileNet design, form the foundation of EfficientNet. These blocks, which lower the model’s computational cost and improve its accuracy, comprise depthwise separable convolutions and 1×1 convolutions. Additionally, an effective base model was created automatically through the network’s design process, which employed the NAS (Neural Architecture Search) method. The EfficientNet family (B0 to B7) was created by compound scaling of this base model. In image classification tasks, EfficientNet has demonstrated strong performance, particularly on large datasets such as ImageNet. EfficientNet is commonly used in resource-constrained applications such as autonomous systems, mobile devices, and medical image processing because of its high accuracy, small number of parameters, and computational efficiency. Because of these characteristics, EfficientNet is a popular option in both academic and industrial settings [27].
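For reference, compound scaling is usually stated as below in the EfficientNet paper. The notation is taken from that paper rather than from this text: φ is a single user-chosen compound coefficient, and α, β, and γ are constants found by a small grid search on the base model.

```latex
\[
\text{depth: } d = \alpha^{\phi}, \qquad
\text{width: } w = \beta^{\phi}, \qquad
\text{resolution: } r = \gamma^{\phi},
\]
\[
\text{subject to } \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2,
\quad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1 .
\]
```

Under this constraint, increasing φ by 1 roughly doubles the computational cost (FLOPs), which is what keeps the B0 to B7 variants on a predictable cost ladder.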

GOOGLENET:

GoogLeNet is the neural network that introduced the Inception v1 structure, a ground-breaking deep learning architecture. This model was created to minimize computing cost and boost accuracy in deep neural networks, and it won the ImageNet competition in 2014. GoogLeNet introduced Inception modules, a technique that maintains the benefits of deep networks while lowering processing costs. Inception modules use convolution kernels of various sizes (1×1, 3×3, and 5×5) in parallel to capture both local and global aspects of an image. They also use 1×1 convolutions to decrease the number of channels, which improves computational efficiency and enables the model to operate more quickly and use less memory. Another notable feature of GoogLeNet is that it drastically cuts down on parameters by eliminating the fully connected layers used in conventional models. This allows the model to use significantly fewer resources while achieving the same degree of accuracy. GoogLeNet’s architecture is 22 layers deep and contains multiple stacked Inception modules. Furthermore, auxiliary classifiers, or intermediate supervision layers, are employed to improve network optimization. These supervision layers help gradients propagate through the network more effectively during training and provide additional regularization. With its victory in the ImageNet competition, GoogLeNet has had a significant influence on the deep learning community. It has been widely applied in areas including object detection, image classification, and autonomous systems. The balance it offers between computational cost and accuracy has made GoogLeNet an important model for both academic and industrial applications [28].

RESNET:

ResNet (Residual Neural Network) is an architecture that makes it simpler to train deep neural networks and to employ deeper architectures. ResNet, which was first presented by Kaiming He et al in 2015, is notable for addressing the vanishing gradient problem that arises particularly in deep networks. ResNet’s primary characteristic is its use of residual connections. Through these connections, a layer’s input is added directly to its output, so the network only needs to learn the residual, that is, the difference between the desired output and the input. In this manner, information loss is avoided and gradients propagate through the network more efficiently. Because of residual connections, ResNet has successfully trained deep networks of up to 152 layers. In addition to offering high accuracy, this network topology makes it possible to train deep models more quickly and stably. With its strong performance in the ImageNet competition, ResNet has had a significant impact and is now a standard model in many applications, particularly transfer learning. It is regarded as a fundamental structure in deep learning architectures and is frequently used in tasks such as object detection, segmentation, and image classification [26].
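The residual connection described above is commonly written as follows. The notation comes from the original ResNet paper rather than from this text: x is the block input, F is the mapping learned by the stacked layers with weights W_i, and y is the block output.

```latex
\[
y = \mathcal{F}(x, \{W_i\}) + x
\]
```

If the ideal mapping is close to the identity, the block only has to drive F toward zero, which is easier to optimize than learning the full mapping from scratch.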

EVALUATION METRICS:

A common performance metric in computer vision and object detection applications is the mAP (mean average precision) number [37]. The “Average Precision” (AP) metrics in the information retrieval domain serve as the foundation for the mAP idea.

Precision quantifies the proportion of the model’s positive predictions that are actually positive. In other words, it conveys how accurate the positive predictions were. Precision is crucial when false positives (FP) must be minimized, for example, when a diagnostic test should flag only true cases of illness.

Recall quantifies the proportion of genuinely positive samples that the model correctly identified. Another name for this is sensitivity. Recall is particularly crucial when positive examples must not be overlooked. For instance, since false negatives (FN) can have major repercussions, high recall is preferred in the diagnosis of cancer.

Specificity is a measure of how well the model identifies negative samples. Its main goal is to lower the rate of false positives (FP). Specificity is crucial when the cost of false positives is high. For instance, false-positive test findings in medicine can result in needless therapies.

The F1-score, which is the harmonic mean of precision and recall, offers a balance between the two. It is helpful when recall and precision must be traded off. When the classes are not balanced (for instance, when there are few positive examples), the F1-score is a useful indicator of overall performance.

Accuracy is the proportion of all of the model’s predictions that are correct, that is, the ratio of correct predictions to the size of the entire dataset. Accuracy is meaningful when the class distribution is balanced. However, in unbalanced datasets, it may be misleading. For instance, the model may attain high accuracy by predicting only negatives in a dataset that contains mostly negative examples.

Each of these measures assesses a particular facet of a classification model and is more important in particular situations. Selecting the right metric for assessing model performance requires careful consideration of the problem and the structure of the data. All performance indicators, including accuracy, precision, recall (sensitivity), specificity, and F1-score, were calculated using conventional definitions and the confusion matrix produced from model predictions. The F1-score was used alongside the other metrics to balance precision and recall.
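The conventional definitions referred to above can be computed directly from the four confusion-matrix counts. The sketch below is illustrative only: the function name `classification_metrics` and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)            # correct positives among predicted positives
    recall = tp / (tp + fn)               # sensitivity: correct positives among actual positives
    specificity = tn / (tn + fp)          # correct negatives among actual negatives
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=80, fp=10, tn=100, fn=20)
```

With these counts, precision is 80/90, recall is 80/100, specificity is 100/110, accuracy is 180/210, and the F1-score is 160/190.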

Results

The performance findings presented in this section reflect the influence of the pre-processing stages and dataset configurations indicated in the Methods section. Despite the tiny sample size, image masking, segmentation, and data augmentation techniques helped to greatly improve model accuracy and stability.

To compare the performance of the models fairly, we used the same hyperparameter settings for all of them. Each training process was conducted over 20 iterations, with a fixed learning rate of 1e-3. We utilized the Adam optimizer [35], which combines the advantages of the RMSProp and Momentum optimization methods. The Adam optimizer was set up with its default parameter values, with the first moment coefficient (β1) and second moment coefficient (β2) set to 0.9 and 0.999, respectively. The β1 parameter controls the decay rate of the moving average of past gradients, while β2 controls the decay rate of the moving average of past squared gradients. We followed this conventional setup since the Adam optimizer is typically used with its default β1 and β2 parameters. Cross-entropy loss was used to measure the difference between the actual and predicted probability distributions. In our classification task, these distributions are represented by categorical labels corresponding to healthy and unhealthy classes. Every input image underwent pre-processing steps, including scaling and normalization, before the model was trained. Each image was resized to 200×200 pixels and normalized using a mean of 0.5 and a variance of 0.5 to ensure consistency in input feature distribution.
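To make the roles of β1 and β2 concrete, a single Adam update can be sketched in NumPy as below. The learning rate and β values are the ones stated in the text; the function name `adam_step`, the epsilon of 1e-8, and the example parameter and gradient vectors are assumptions for illustration.

```python
import numpy as np


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update and return the new parameter and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad        # moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


# Hypothetical two-parameter example, first step (t=1) from zero-initialised moments.
w = np.array([1.0, -2.0])
g = np.array([0.5, -4.0])
w_next, m_next, v_next = adam_step(w, g, np.zeros(2), np.zeros(2), t=1)
```

On the first step the bias-corrected ratio reduces to approximately the sign of the gradient, so each parameter moves by about one learning rate (1e-3) regardless of the gradient's magnitude.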

To generate the dataset, the data were first labeled. Pre-processing procedures were then carried out so that the data could be used more effectively. Table 1 displays the outcomes of the YOLOv10 analysis, in which several YOLOv10 variants were compared. The model that produced the best results was then used to run detection on the images. Detections with a confidence score (that is, a detection percentage) higher than 51% were added to the dataset, and transfer learning was then carried out on this dataset. This threshold was chosen because the detected images fall into 2 classes (positive and negative).
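The selection step described above amounts to filtering detections by confidence. The sketch below assumes a simple dictionary-based record format; the function name `select_confident_detections`, the field names, and the example records are hypothetical, while the 0.51 threshold comes from the text.

```python
def select_confident_detections(detections, threshold=0.51):
    """Keep detections whose confidence score is strictly above the threshold."""
    return [d for d in detections if d["confidence"] > threshold]


# Hypothetical detection records, for illustration only.
detections = [
    {"image": "slice_001.png", "label": "Positive", "confidence": 0.89},
    {"image": "slice_002.png", "label": "Positive", "confidence": 0.76},
    {"image": "slice_003.png", "label": "Positive", "confidence": 0.42},
]
kept = select_confident_detections(detections)
```

Here the first two records pass the threshold and would be added to the dataset, while the third is discarded.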

The highest mAP50 (82.5%) was obtained with the YOLOv10-N model, and the lowest (79.2%) with the YOLOv10-M model. In general, mAP50 values were close across the models, but YOLOv10-N and YOLOv10-S performed better. The highest precision was obtained by the YOLOv10-B model, with 87.9%, followed by YOLOv10-N, which shows that these models have a high rate of correct positive predictions. The highest recall was obtained by the YOLOv10-S model, with 77.3%. The recall values of the YOLOv10-B and YOLOv10-X models were lower (73.6% and 74.3%, respectively), which shows that they missed some of the positive examples. For the F1-score, which balances precision and recall, the best performance was seen in the YOLOv10-N and YOLOv10-B models, with 78.5% and 77.9%, respectively, whereas the YOLOv10-M model had a lower F1-score (76.0%) than the other models.

The lowest box loss was obtained with the YOLOv10-X model, at 1.16, which indicates that its bounding boxes were better optimized; the highest was with the YOLOv10-N model, at 1.37. The lowest class loss was also seen in the YOLOv10-X model, at 0.55, indicating better performance in class prediction, whereas the YOLOv10-N model had a higher class loss (0.72). The lowest DFL loss was obtained with the YOLOv10-X model, at 0.89, which indicates that its distribution-based predictions were better optimized. The lowest validation loss was in the YOLOv10-N model, at 1.63. Although the validation loss values of the other models were slightly higher (1.67–1.76), they were at an acceptable level. The lowest validation class loss was 0.83, in the YOLOv10-X model, indicating that this model had the best class prediction performance on the validation data. Very similar validation DFL loss values (0.98–1.02) were obtained among all models, indicating that the models generally performed similarly in distribution prediction.

YOLOv10-X stands out with its low values, especially in training loss and validation loss metrics. It also showed a competitive performance in important metrics such as mAP50 and F1-score. YOLOv10-N and S models reached the highest values in critical metrics such as mAP50–95 and Precision. YOLOv10-M showed lower performance compared to the others and may need improvement. In general, the YOLOv10-X and YOLOv10-N models had the most balanced results in terms of performance and loss values.

Figure 6 contains axial CT images processed using the YOLOv10 algorithm to detect cancer metastases. Each panel (A, B, C, and D) represents a different anatomical section of the thoracic region, and potential metastatic sites were detected and marked with bounding boxes. The values shown on the boxes are the confidence scores (ranging from 0.59 to 0.89), expressing the estimated probability that the detected lesions are metastases. Consistent detections in different sections demonstrate the potential of the YOLOv10 algorithm as an auxiliary tool in diagnostic imaging. This method can help reduce the workload of radiologists and increase detection accuracy by supporting the correct identification of metastatic sites for early intervention, especially in oncological cases. In each panel, the model marks the regions it identified with the label “Positive” together with the confidence score of the detection. In Figure 6A, a positive detection was made in the lung region with a high confidence score (0.89), indicating that the model assigned a high probability to the region being cancerous. Figure 6B contains a less pronounced positive detection with a lower confidence score (0.76). In Figure 6C and 6D, multiple regions were detected, with confidence scores ranging from 0.59 to 0.83. This demonstrates the model’s ability to detect multiple abnormal areas in the same section. In general, we observed that the model’s detections had high confidence scores and successfully marked the potentially cancerous areas. These results suggest that the model has a good level of accuracy for medical image analysis.

Figure 7 shows the losses and performance metrics during the training of the YOLO-based model. The upper-row graphs show how the model’s losses decrease during training. Box loss reflects how well the model learns the correct locations of objects, and this loss decreased steadily during training. Similarly, class loss measures the model’s performance in correctly predicting classes, and the decrease in loss values indicates that the model improved in this respect. DFL (distribution focal loss) is a loss term aimed at optimizing errors in distribution-based predictions, and its decrease indicates that the model learned the probability distributions better. The lower-row graphs show the improvement in performance metrics. Recall shows how well the model detects positive examples, and it increased steadily throughout training. mAP50 measures the mean average precision of the model when the IoU (Intersection over Union) threshold is 50%; this value increased continuously during training and reached approximately 0.8. A more demanding metric, mAP50–95, averages over IoU thresholds from 50% to 95%; it also increased, showing that the model improved under stricter localization criteria, but it remained limited to around 0.4. Precision expresses the proportion of the examples predicted as positive that are truly positive; this value increased steadily throughout training and reached 0.8. In general, the steady decrease in training losses indicates that the model performed progressively better in object localization and class prediction with each iteration, and the continuous increase in performance metrics shows that its overall performance improved. However, the relatively lower value of mAP50–95, which includes stricter IoU thresholds, indicates that further improvement may be needed. Overall, the model developed stably throughout training and achieved strong overall performance.
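As an illustration of the IoU thresholds behind mAP50 and mAP50–95, the following minimal Python sketch computes IoU for two axis-aligned boxes and lists which thresholds from 0.50 to 0.95 (in steps of 0.05, the convention commonly used for mAP50–95) a prediction would satisfy. The box coordinates and function names are hypothetical and are not taken from the study, and the sketch does not compute mAP itself.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# IoU thresholds 0.50, 0.55, ..., 0.95.
THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, true_box):
    """Thresholds at which the prediction counts as a correct localization."""
    score = iou(pred_box, true_box)
    return [t for t in THRESHOLDS if score >= t]


# Hypothetical ground-truth and predicted boxes (pixel coordinates).
truth = (100, 100, 200, 200)
pred = (110, 110, 210, 210)

print(round(iou(pred, truth), 3))       # IoU of the two boxes
print(matched_thresholds(pred, truth))  # thresholds the prediction passes
```

In this example the prediction overlaps the ground truth well enough to count at the 50% threshold but fails the stricter thresholds, which is the kind of behavior that keeps mAP50–95 below mAP50.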

The data were divided into 3 subsets: training, validation, and testing. Of the training data, 20% was set aside as validation data and the remaining 80% was used for training. The results are given in Table 2.
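The 80/20 division of the training data can be sketched as follows. This is a minimal illustration under stated assumptions: the random seed, the function name `split_train_val`, and the item identifiers are hypothetical, and the text does not state whether the split was made per image or per patient, so the function simply splits whatever identifiers it is given.

```python
import random


def split_train_val(items, val_fraction=0.2, seed=0):
    """Shuffle items reproducibly and hold out val_fraction for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    # The first n_val shuffled items form the validation set.
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical identifiers standing in for the non-test portion of the data.
image_ids = [f"img_{i:04d}" for i in range(1000)]
train_ids, val_ids = split_train_val(image_ids)
print(len(train_ids), len(val_ids))
```

Passing patient identifiers instead of image identifiers would keep all slices from one patient in the same subset.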

The graph in Figure 8 shows the change in loss and accuracy values during the training process of a model. The loss graph on the left showed that training and validation losses decreased as the number of epochs increased. This shows that the learning process of the model was successful on both datasets.

However, the fluctuations in the validation loss suggest that the model occasionally makes unstable predictions on the validation data, so some risk of overfitting cannot be excluded. The accuracy plot on the right shows that both training and validation accuracies increased rapidly and reached approximately 95–100%. The training and validation accuracies remained close to each other, which suggests that the model generalizes well and does not substantially overfit the training data.

The ResNet-50 model achieved the highest accuracy, at 96.4%, indicating that its overall performance was better than that of the other models. The ResNet-18 (95.8%) and EfficientNet-B1 (95.3%) models also stood out with high accuracy values. The ResNet-18 model had the highest precision, at 92.2%, indicating that most of the examples it predicted as positive were correct. The ResNet-50 (90.3%) and EfficientNet-B1 (89.0%) models were also strong in terms of precision. The ResNet-50 model had the highest sensitivity, at 94.1%, showing that it correctly captured a large portion of the positive examples. The ResNet-18 (89.9%) and EfficientNet-B1 (87.4%) models also performed well in terms of sensitivity. The EfficientNet-B1 model had the highest specificity value, at 97.3%, meaning that it correctly detected a large portion of the negative examples. The ResNet-18 (97.6%) and ResNet-50 (97.1%) models were also strong in terms of specificity. ResNet-50 had the highest F1-score, at 92.1%, showing that it provided the best balance between precision and sensitivity. The ResNet-18 (91.0%) and EfficientNet-B1 (88.2%) models were also notable in terms of F1-score. ResNet-50 provided a balanced performance across all metrics and achieved the best results overall. ResNet-18 was a strong model, especially in terms of precision and specificity, but it fell behind ResNet-50 in terms of sensitivity. EfficientNet-B1 was the best model in terms of specificity and was also successful in overall accuracy, but it fell slightly behind in terms of precision and sensitivity. DenseNet-161 performed poorly compared to the other models, with 92.7% accuracy, and was below average in all metrics. GoogLeNet performed well, with 94.5% accuracy and 89.2% precision, but fell behind the ResNet models. Table 2 shows that ResNet-50 provided the best overall performance and outperformed the other models in terms of accuracy and sensitivity. However, in cases where certain metrics (such as specificity or precision) need to be prioritized, models such as EfficientNet-B1 or ResNet-18 may be preferred.

Additional images obtained using the YOLOv10 model were added to the dataset, and the transfer learning models were retrained on the expanded dataset. The results obtained after adding these images are shown in Table 3.

In the analyses performed with the first dataset, the performance of different deep learning models was compared and these results are presented in Table 2. The performances of the DenseNet-121, DenseNet-161, EfficientNet-B0, EfficientNet-B1, GoogLeNet, ResNet-18, and ResNet-50 models according to basic metrics such as Accuracy, Precision, Recall, Specificity, F1-score, and ROC-AUC are detailed. In general, the ResNet-50 model showed the highest performance on the first dataset, with 96.39% Accuracy and 95.59% ROC-AUC. Similarly, the ResNet-18 model stood out in terms of Precision (92.22%). These evaluations show how the distribution of examples in the first dataset and the interaction of the models with these data affect the modeled results. In the second dataset, a significant improvement was observed in the performance of all models with the optimizations made and the enrichment of the dataset. According to the results in Table 3, the GoogLeNet model exhibited the highest performance, with 97.34% Accuracy and 96.88% ROC-AUC. In addition, the EfficientNet-B0 model stood out, with 95.29% Precision and 94.70% Recall. Significant improvements were also recorded in ResNet-18 and DenseNet-121. These results show that the model performances became stronger due to the better balancing, expansion, or optimization of the second dataset. The second dataset provided a structure that better represented the deeper learning capacity of the models and provided an accuracy closer to real-world applications.

The results obtained from 2 experimental settings (one using only the original dataset and the other incorporating images generated by the YOLOv10 model into the original dataset) revealed differences in model performance. It is important to emphasize that all models demonstrated high classification success in all measurements in both experimental settings. This indicates that the models used in this study can be effectively employed for the classification of pulmonary metastases of breast cancer. When trained on the original dataset, the ResNet-50 and ResNet-18 architectures outperformed the others. This suggests that the efficiency of the skip connections in the ResNet architecture in maintaining gradient flow has a positive effect on model performance. In contrast, when the dataset was augmented with YOLOv10-generated images, the GoogLeNet and ResNet-18 architectures achieved superior performance compared to the others. ResNet-18’s steady performance in both experimental configurations suggests that moderately deep neural networks with skip connections may be successful in classifying breast cancer pulmonary metastases. Additionally, GoogLeNet’s parallel convolutional layers with variable kernel sizes enable the model to capture features at different scales at the same time, which might account for its better performance when more training examples are available. Based on these findings, we propose several key recommendations: since all models achieved high classification accuracy, an ensemble model incorporating all architectures could be employed to obtain more robust results; a majority voting strategy could be used to determine the final classification decision; and a custom model combining the strengths of both the ResNet and GoogLeNet architectures could be developed to leverage their complementary advantages for the classification of pulmonary metastases of breast cancer.
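The majority voting strategy recommended above can be sketched in a few lines of Python. This is a minimal illustration, not an implementation from the study: the function name `majority_vote` and the per-model predictions are hypothetical, and labels are encoded as 1 (metastasis) and 0 (no metastasis) for convenience.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by majority."""
    n_samples = len(predictions_per_model[0])
    final = []
    for i in range(n_samples):
        # Count the label each model assigned to sample i.
        votes = Counter(preds[i] for preds in predictions_per_model)
        final.append(votes.most_common(1)[0][0])
    return final


# Hypothetical predictions from three classifiers on four images.
resnet_preds = [1, 0, 1, 0]
googlenet_preds = [1, 1, 0, 0]
efficientnet_preds = [0, 0, 1, 0]

print(majority_vote([resnet_preds, googlenet_preds, efficientnet_preds]))
```

With an odd number of models and two classes, ties cannot occur; an even-sized ensemble would need an explicit tie-breaking rule, for example averaging predicted probabilities.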

Discussion

To improve the detection of lung metastases due to breast cancer, a deep learning model was created using YOLOv10 and transfer learning methods. The limited sample size (16 patients) used can be considered a limitation in terms of generalizability of the study. However, the most important point to emphasize is the number of images, which is the main data unit used in training the model. A total of 1264 labeled CT images were used, and the learning capacity of the model was increased with this large amount of data. In addition, thanks to the object detection capabilities of YOLOv10, maximum information extraction was achieved from images obtained from a small number of patients. In particular, YOLO’s object localization and classification capabilities ensured the precise determination of metastatic lesions, thus reducing the need for data augmentation and contributing to the model’s high performance with less data. This provides a significant advantage for overcoming the problem of insufficient data, which is frequently encountered in deep learning models.

The effectiveness of the various deep learning models employed in the study to detect pulmonary metastases of breast cancer is thoroughly covered below. The success of the models in metrics such as accuracy, precision, recall, specificity, and F1-score is analyzed according to the difficulty level of the classification problem and the characteristics of the dataset. The study results show that ResNet-50 and ResNet-18 stand out in terms of high accuracy and overall performance, and that EfficientNet-B1 is the most successful model in the specificity metric, whereas DenseNet performed worse than the other models. In line with the information obtained from graphs such as the ROC curve and confusion matrix, the capacity of the models to distinguish positive and negative classes was evaluated together with the reasons for misclassifications. The analysis also aims to determine which models are more suitable in which scenarios by comparing the strengths and weaknesses of the models in different metrics. For example, ResNet-50 stands out as an ideal model in terms of sensitivity and accuracy, while EfficientNet-B1 was considered a preferable option in cases requiring specificity. However, the lower performance of DenseNet may be a result of limitations specific to the dataset or model architecture, and this is an area of improvement for future studies. Overall, this study highlights the potential of deep learning models in classification tasks and shows that the results provide important information for both theoretical and practical applications. Furthermore, the acquired results contribute to the literature and support the potential of such models in essential applications such as diagnosing lung metastases of breast cancer. Our model showed better accuracy and AUC values in identifying metastatic lesions when compared to earlier research. For example, our model reached up to 97.3% accuracy for lung metastasis diagnosis, whereas Wang et al reported an accuracy of 93% in lymph node metastasis prediction using dual-energy CT [16]. Similarly, Zhao et al’s 3D cross-modality network for lymph node metastasis produced an AUC of 0.926 [17], while our model produced an AUC of 0.99. These comparisons demonstrate the higher performance and potential for earlier and more accurate metastasis identification provided by our method, which combines YOLOv10 and transfer learning. Our study fills a critical gap in the literature by focusing on lung metastasis detection in breast cancer patients using CT images, as opposed to other efforts that concentrated on lymph node metastases or other imaging modalities.

Figure 9 evaluates the classification performance of the model through the confusion matrix and ROC curve. According to the confusion matrix, the numbers of true-positive (176) and true-negative (626) predictions are high, while the numbers of false-positive (19) and false-negative (11) predictions are low, which shows that the model generally performs balanced and accurate classification. The ROC curve presents the relationship between the true-positive and false-positive rates of the model at different threshold values. The AUC value calculated for both classes is 0.99, which indicates that the classification performance of the model is almost perfect. In general, the model distinguishes both positive and negative classes with high accuracy and stands out as a successful classifier.
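To show how an ROC curve and its AUC arise from threshold sweeping, the following minimal Python sketch builds the (false-positive rate, true-positive rate) points from classifier scores and integrates them with the trapezoidal rule. The labels and scores are hypothetical toy values, not data from the study, and the function names are illustrative.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples sharing the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical labels (1 = metastasis) and predicted scores for six images.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auc(roc_points(labels, scores)), 3))
```

An AUC of 1.0 means every positive sample received a higher score than every negative sample; each positive that is outranked by a negative lowers the value.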

Figure 10 uses the confusion matrix and ROC curve to analyze in detail the classification performance of the model after improvements in the dataset or model. According to the confusion matrix, the number of true positives (328) and the number of true negatives (622) are high, which shows that the model can distinguish the classes correctly. The low numbers of false positives (10) and false negatives (16) support the balance in classification and the robustness of the model. This improvement in performance shows that the model’s capacity to reduce error rates and obtain more reliable results during the training process has increased. The ROC curve visualizes the relationship between the true-positive rate (sensitivity) and the false-positive rate of the model at different classification thresholds. In this case, the AUC value calculated for both classes was 1.00, which shows that the model has excellent classification performance. The nearly perfect curve emphasizes the model’s capacity to distinguish the 2 classes with almost no overlap. These results demonstrate the success of the improvements made in the model training and validation process and support the usability of the model in critical applications requiring high accuracy and a low error rate. In conclusion, this analysis shows that the model can be used as a reliable and accurate classifier.
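The relationship between the confusion matrix counts and the reported metrics can be checked with a short calculation. The following Python sketch applies the standard definitions of accuracy, precision, sensitivity, specificity, and F1-score to the counts quoted for Figures 9 and 10; the function and variable names are illustrative and are not taken from the study.

```python
def binary_metrics(tp, tn, fp, fn):
    """Standard binary classification metrics from confusion matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Counts quoted in the text for Figure 9 and Figure 10.
fig9 = binary_metrics(tp=176, tn=626, fp=19, fn=11)
fig10 = binary_metrics(tp=328, tn=622, fp=10, fn=16)

for name, metrics in (("Figure 9", fig9), ("Figure 10", fig10)):
    print(name, {k: round(100 * v, 2) for k, v in metrics.items()})
```

Under these definitions, the Figure 9 counts give 96.39% accuracy, 94.12% sensitivity, 97.05% specificity, 90.26% precision, and a 92.15% F1-score, which agree with the values reported for ResNet-50 on the first dataset. The Figure 10 counts give 97.34% accuracy and 98.42% specificity, which agree with the values reported for GoogLeNet on the expanded dataset.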

Wang et al [16] proposed a model based on multiple energy levels for the prediction of lymph node metastasis (Nmet). The model was specifically designed with dual-energy computed tomography (CT) data obtained at different energy levels using Gemstone Spectral Imaging (GSI). The energy levels used were divided into 3 groups: low-energy fusion (40, 50, 60, 70 keV), high-energy fusion (110, 120, 130, 140 keV), and average-energy fusion (40, 70, 100, 140 keV). The model performed best, with 93% accuracy and an 86% Kappa value, when trained with low-energy-level data. The 5-fold cross-validation results revealed that the model based on low energy levels was more powerful and consistent in Nmet prediction. In addition, low energy levels provided more information about tumor angiogenesis and heterogeneity, thus increasing the success of the model in Nmet prediction. The accuracy value (93%) is lower than in our study (97.34%). Other metrics were not reported and therefore could not be compared.

Zhao et al [17] proposed a cross-modality 3D deep learning model named DensePriNet for the accurate prediction of lymph node (LN) metastasis in patients with clinical stage T1 lung adenocarcinoma. The model predicts the LN status by combining preoperative CT images and prior clinical information. A total of 501 patients were used in the study, of which 401 were used for training and validation of the model, and 100 were reserved for testing. The performance of DensePriNet was compared with logistic regression model, deep learning model alone, radiomics methods, and manual assessments by radiologists. The results showed that the AUC value of DensePriNet was higher than with other methods (logistic regression: 0.904, deep learning alone: 0.880, radiomics methods: 0.891), at 0.926. In addition, the Matthews Correlation Coefficient (MCC) value of 0.705 significantly exceeded the performance of both a senior radiologist (0.534) and a junior radiologist (0.416). These results indicate that the integration of clinical information into the deep learning model increases the accuracy of predicting LN metastasis and facilitates the development of individualized treatment plans. In our study, the AUC value was 0.99 and the accuracy was also higher.

Wang et al [38] proposed a new deep learning method with a size-related damper block for accurate prediction of lymph node metastasis (Nmet) from primary tumors in lung cancer. The model uses monochromatic images generated with Gemstone Spectral Imaging (GSI) and dual-energy computed tomography (CT) data. The model trained at the 40 keV energy level exhibited the best performance, with 86% accuracy and a 72% Kappa value in Nmet prediction. In the study, 11 different monochromatic images between 40 and 140 keV at 10 keV intervals were used for each patient. The model trained at 40 keV showed a significant difference compared to the models trained at other energy levels. The 5-fold cross-validation results revealed that lower energy levels (such as 40 keV) were more effective for Nmet prediction. The findings show that tumor heterogeneity and size contribute significantly to the success of the model in predicting the presence or absence of lymph node metastasis. These accuracy values are lower than those in our study. A more accurate method was followed by subtracting the mean values with cross-validation.

Li et al [18] presented an innovative deep learning framework for the diagnosis of distant metastasis (DM), which is the leading cause of death in advanced lung cancer. Although PET scans are one of the standard methods for diagnosing DM, they have limited clinical application due to their high cost and the harmful effects of contrast agent use. To address these issues, that study developed a model called Mask-Guided Two-stream Attention (MGTA). The model uses the 3D Pseudo-Siamese Feature Pyramid Network (PSFPN) mechanism to learn both the global features of the whole lung and the local features of the tumor, and includes the Deep Cascade Attention Module (DCAM) to combine these features. The main advantage of MGTA is that it is the first model to extract tumor and whole-lung information simultaneously for DM prediction. While existing methods focus only on tumor regions, this model considers rich information from the entire lung by using the mask-guided attention mechanism. This approach offers significant potential in terms of accuracy and clinical applicability. That study addresses the problem in a different framework from ours, and its mask-guided design is an innovative method; however, its reported results were lower than those in our study.

Grossman et al [39] differentiated between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) in lung cancer-related brain metastases using deep learning and transfer learning approaches using conventional magnetic resonance imaging (MRI) data. A total of 69 patients – 44 NSCLC and 25 SCLC – were included in the study. EfficientNet architecture was used for classification on cropped images of lesion areas, and the model was trained on contrast-enhanced T1-weighted, T2-weighted, and FLAIR imaging data. Model evaluation was performed using 5-fold cross-validation method and accuracy, precision, sensitivity, and F1 score, and area under the ROC curve (AUC) metrics were used. The best classification results were obtained with multi-parameter MRI input data (T1WI+c+FLAIR+T2WI). This method provided an average of 90% accuracy and an F1 score of 0.92 for NSCLC and 0.87 for SCLC on validation data. On test data, the accuracy was 87%, with F1 scores of 0.88 for NSCLC and 0.85 for SCLC. Although the study achieved high results on validation data, the performance on test data was comparatively lower.

Guo et al [40] proposed a network consisting of 3 stages: pre-training, transfer learning, and fusion of features from 2 image views. In the first stage, the model was trained on a source dataset consisting of chest X-ray images. Then, it was adapted to the target dataset consisting of clinically obtained scintigraphy images by transfer learning method. In the last stage, the presence or absence of metastasis was determined by combining the features extracted from anterior and posterior views. Experiments conducted on the dataset consisting of clinical scintigraphy images showed that the model was successful in automatically classifying metastasis images. The average values were calculated as accuracy (0.7710), precision (0.8311), sensitivity (0.6827), and F1 score (0.7475). The proposed classification network was more successful in detecting lung metastases compared to existing methods and provided an effective solution that can be used in such diagnostic tasks.

Bone metastasis is one of the most common diseases in breast, lung, and prostate cancers, and the basic imaging method that offers the highest sensitivity (95%) in screening for metastases is bone scintigraphy [41]. This research focused on breast cancer patients and examined artificial intelligence methods and deep learning algorithms for bone metastasis diagnosis. In that study, deep learning and convolutional neural networks (CNN), which are powerful algorithms for automatic classification and diagnosis of medical images, were used. The aim of the study was to develop a robust CNN model that can classify whole-body scan images according to the presence of breast cancer metastases. A robust CNN architecture was selected that achieved a high classification accuracy of 92.50% on whole-body scan images for bone metastasis diagnosis. In addition, the proposed CNN method was compared with popular CNN architectures such as ResNet50, VGG16, MobileNet, and DenseNet, which are frequently reported in the literature, and provided superior classification accuracy. The proposed deep learning approach has shown high effectiveness in nuclear medicine for bone metastasis diagnosis in breast cancer patients. The results obtained are consistent and high. The accuracy, precision, and recall appear to be lower than in our study.

Botlagunta et al [42] used histopathological information on malignancy for cancer classification, despite some limitations. The aim of the study was to develop a non-invasive breast cancer classification system for the diagnosis of cancer metastases. Various Python modules were developed on the Anaconda-Jupyter Notebook platform for text mining, data processing, and machine learning methods. The study used cross-validation criteria such as accuracy, AUC, and ROC to evaluate the prediction performance of the classification models. The statistical significance of the data was analyzed by the Welch independent t-test. The text mining framework applied to electronic medical records facilitated the separation of blood profile data and the identification of patients with metastatic breast cancer (MBC). Monocytes showed a significant mean difference between healthy individuals and MBC patients. Removal of outliers from the blood profile data significantly increased the accuracy of the machine learning models. The decision tree (DT) classifier performed best, achieving 83% accuracy and 0.87 AUC. The DT classifiers were then integrated into a web application using Flask, and a system was developed for robust diagnosis of MBC patients. The results show that ML models based on blood profile data can help identify MBC patients requiring intensive care and may improve overall survival outcomes. When compared with our study, the recall value reported by Botlagunta et al [42] was very high, which is mainly related to the dataset characteristics and the results they obtained. However, when the accuracy values and other performance metrics were compared with those of our study, lower values were observed.

Zhong et al [43] aimed to develop more accurate prediction models based on innovative machine learning algorithms and to provide effective decision-making support to clinicians. Breast cancer patients registered in the Surveillance, Epidemiology, and End Results (SEER) database between 2010 and 2016 were retrospectively analyzed. Multivariate logistic regression analyses were applied to determine risk factors for bone metastasis in breast cancer, and Cox proportional hazard regression analyses were used to determine prognostic factors for breast cancer with bone metastasis (BCBM). Based on risk and prognostic factors, diagnostic and prognostic models including 6 machine learning classifiers were developed. The performance of the models was evaluated using area under the ROC curve, learning curve, accuracy curve, calibration plot, and decision curve analysis. Univariate and multivariate logistic regression analyses showed that bone metastases were significantly associated with age, race, sex, tumor grade, T stage, N stage, surgery, radiotherapy, chemotherapy, tumor size, brain metastasis, liver metastasis, lung metastasis, breast subtype, and PR. Cox regression analyses revealed that age, race, marital status, grade, surgery, radiotherapy, chemotherapy, brain metastasis, liver metastasis, lung metastasis, breast subtype, ER, and PR were closely associated with the prognosis of BCBM. Among the 6 machine learning models, the XGBoost algorithm provided the most accurate prediction results (Diagnostic model AUC=0.98; Prognostic model AUC=0.88). Shapley additive explanations (SHAP) analysis showed that the most important feature of the diagnostic model was surgery, followed by N stage. Interestingly, surgery was also identified as the most critical feature of the prognostic model, followed by liver metastasis. 
The XGBoost-based model could effectively predict the diagnosis and survival time of bone metastasis in breast cancer and provide targeted references for the treatment of BCBM patients. Their study reported very high performance, with AUC scores of 0.98 for the diagnostic model and 0.88 for the prognostic model. The precision values are also extremely high, which is important for the accuracy and usability of the study. However, when other performance metrics are examined, the results obtained in our study are higher. The findings, including the results of other studies in the literature and our study, are summarized in Table 4.

Thoracic CT images taken for lung metastasis control of breast cancer patients can be scanned quickly, sensitively, and with high specificity if the application developed from our study can be integrated into hospital information management systems. This will save time, detect metastases early, and help plan treatment for patients who have metastasis.

One limitation of our work is the lack of systematic bias detection assessments. Our dataset was from a small, single-center cohort, which may limit generalizability. Future research will evaluate demographic and clinical biases to assure fairness and dependability in larger groups.

As a next step, we plan to create an API that integrates the learned YOLOv10 model for automated lung metastasis diagnosis. Future work will include prospective clinical validation and incorporation into radiology workflows to help make real-time decisions. The single-center aspect of our dataset, in addition to the small sample size, may limit the generalizability of our findings to larger populations with distinct clinical or demographic features. Additionally, the results may have been affected by the retrospective design and possible selection bias. To confirm the effectiveness of our model and guarantee its suitability in various clinical contexts, more research including bigger, multi-center cohorts and prospective designs is required.

Conclusions

Based on our study’s findings, CT scans combined with our algorithm could be used to identify lung metastases of breast cancer early. Early identification could allow treatment to begin before treatment options become limited and offers further advantages, including a reduced workload for radiologists.

This study highlights the advantages of the YOLOv10 model in working with small datasets and the effectiveness of the transfer learning method. The ability of YOLOv10 to obtain high-accuracy results with a small amount of data provides a significant advantage in areas requiring limited and sensitive data such as medical image analysis. In particular, by using the transfer learning strategy, the information obtained from the pre-trained models was adapted to a new dataset and the classification performance was increased. This approach supported the ability to correctly distinguish positive and negative classes despite the small amount of data. The integration of YOLOv10 with transfer learning increased the generalization capacity of the model in limited datasets and allowed for more reliable results. This provides an effective and practical solution, especially for researchers and practitioners working with small datasets. The obtained results show that the combination of YOLOv10 and transfer learning is a promising approach to overcome data limitations in medical and other applications.

This study evaluated the potential of deep learning-based models in critical tasks such as medical image analysis by comparing the classification performance of different model architectures. Commonly used models such as ResNet, DenseNet, EfficientNet, and GoogLeNet were comprehensively analyzed with metrics such as accuracy, precision, sensitivity, specificity, and F1-score. The main purpose of the study was to determine the strengths and weaknesses of these models in their classification performance and to reveal which model is more suitable in which scenarios. In the methodology used, the success of the models on both positive and negative classes has been analyzed in detail with evaluation tools such as confusion matrices and ROC curves.

The results show that ResNet-50 is the most successful model in metrics such as accuracy (96.4%), sensitivity (94.1%), and F1-score (92.1%). ResNet-18 showed strong performance in terms of precision (92.2%) and specificity (97.6%), and EfficientNet-B1 achieved the highest success, with 97.3% specificity. The GoogLeNet and DenseNet models showed lower performance in general, but produced meaningful results in certain scenarios. The high AUC values (0.99) obtained from ROC curves show that the models are quite reliable in the classification task. While this study supports the potential of deep learning models in sensitive applications such as medical image analysis, it emphasizes the importance of conducting studies on larger and balanced datasets in the future to improve the performance of these models. In addition, the obtained results provide important clues on how the models can be optimized in real-world applications. The results obtained after adding to the dataset show that GoogLeNet in particular exhibited the highest performance among all models, with 97.3% accuracy and 98.4% specificity. The ResNet-50 and EfficientNet-B0 models also stood out, with over 96% accuracy and high ROC-AUC values. These results show that the additions and optimizations made to the dataset significantly increased the overall performance of the models. The success achieved by GoogLeNet in particular indicates that this model may be preferred in critical applications such as medical image analysis. This stage of the study once again reveals the importance of dataset quality and scope for better adaptation of deep learning models to real-world applications. In conclusion, our work demonstrates the high diagnostic specificity and accuracy attained by combining YOLOv10 with transfer learning, allowing for the accurate and timely identification of lung metastases in patients with breast cancer. 
These findings suggest that our model can be useful in guiding treatment planning, improving patient outcomes, and reducing radiologists' workload. To validate these encouraging results and enable clinical integration, further testing on larger, more varied datasets is necessary. Although deep learning models such as YOLOv10 may help radiologists by enhancing diagnostic efficiency and confidence, prospective research is required to measure any effects on workload and workflow before practical deployment can be recommended. Prompt and precise identification of lung metastases may facilitate timely treatment decisions, ultimately improving patient quality of life and survival. Additionally, incorporating these AI capabilities into hospital information systems could help standardize reporting and lower inter-observer variability, which would encourage more uniform patient care across clinical contexts. Furthermore, our findings exceed the diagnostic accuracy reported in past research, including Wang et al and Zhao et al, suggesting the possibility of earlier and more accurate detection of lung metastases in breast cancer, which could in turn improve patient outcomes and make clinical workflows more efficient. Our study has limitations, including its single-center design and limited sample size, which may affect generalizability. Multi-center prospective studies are needed to confirm our findings before the model is used in clinical settings. Overall, our method addresses a gap in the literature by demonstrating the viability and benefits of integrating YOLOv10 with transfer learning for accurate identification of metastases.

Figures

Figure 1. Flowchart of the study.

Figure 2. (A–D) Pre-processing steps applied to the image.

Figure 3. YOLOv10 working principle [24].

Figure 4. General CNN architecture [19].

Figure 5. The general scheme of the proposed framework model. All pre-trained models were used for training in order. In the scheme, the selected pre-trained model is shown by the line.

Figure 6. (A–D) Results with YOLOv10.

Figure 7. YOLOv10 results obtained.

Figure 8. Loss and accuracy graph obtained.

Figure 9. Confusion matrix and ROC curve for the initial results.

Figure 10. Confusion matrix and ROC curve detailing the classification performance of the model after improvements to the dataset or model.

Tables

Table 1. YOLOv10 model performance and loss metrics.

Table 2. The results of the models with the first dataset.

Table 3. The results of the models with the final dataset.

Table 4. Comparison of the study with other studies.

References

1. Sung H, Ferlay J, Siegel RL, Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries: CA Cancer J Clin, 2021; 71(3); 209-49

2. Siegel RL, Miller KD, Wagle NS, Jemal A, Cancer statistics, 2023: CA Cancer J Clin, 2023; 73(1); 17-48

3. Emens LA, Davidson NE, The follow-up of breast cancer: Semin Oncol, 2003; 30(3); 338-48

4. Torre LA, Bray F, Siegel RL, Global cancer statistics, 2012: CA Cancer J Clin, 2015; 65(2); 87-108

5. Ettinger DS, Akerley W, Bepler G, Non-small cell lung cancer: J Natl Compr Canc Netw, 2010; 8(7); 740-801

6. Siegel RL, Miller KD, Jemal A, Cancer statistics, 2018: CA Cancer J Clin, 2018; 68(1); 7-30

7. Chia SK, Speers CH, D’Yachkova Y, The impact of new chemotherapeutic and hormone agents on survival in a population-based cohort of women with metastatic breast cancer: Cancer, 2007; 110(5); 973-79

8. Gennari A, Conte P, Rosso R, Orlandini C, Bruzzi P, Survival of metastatic breast carcinoma patients over a 20-year period: A retrospective analysis based on individual patient data from six consecutive studies: Cancer, 2005; 104(8); 1742-50

9. Rueda JR, Sola I, Pascual A, Casacuberta MS, Non-invasive interventions for improving well-being and quality of life in patients with lung cancer: Cochrane Database Syst Rev, 2011; 2011(9); CD004282

10. Jafari SH, Saadatpour Z, Salmaninejad A, Breast cancer diagnosis: Imaging techniques and biochemical markers: J Cell Physiol, 2018; 233(7); 5200-13

11. Li X, Zhang S, Zhang Q, Diagnosis of thyroid cancer using deep convolutional neural network models applied to sonographic images: A retrospective, multicohort, diagnostic study: Lancet Oncol, 2019; 20(2); 193-201

12. Esteva A, Kuprel B, Novoa RA, Dermatologist-level classification of skin cancer with deep neural networks: Nature, 2017; 542(7639); 115-18

13. Litjens G, Sánchez CI, Timofeeva N, Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis: Sci Rep, 2016; 6(1); 26286

14. Cireşan DC, Giusti A, Gambardella LM, Schmidhuber J, Mitosis detection in breast cancer histology images with deep neural networks: Med Image Comput Comput Assist Interv, 2013; 16(Pt 2); 411-18

15. Ertosun MG, Rubin DL, Automated grading of gliomas using deep learning in digital pathology images: A modular approach with ensemble of convolutional neural networks: AMIA Annu Symp Proc, 2015; 2015; 1899-908

16. Wang YW, Chen CJ, Wang TC, Multi-energy level fusion for nodal metastasis classification of primary lung tumor on dual energy CT using deep learning: Comput Biol Med, 2022; 141; 105185

17. Zhao X, Wang X, Xia W, A cross-modal 3D deep learning for accurate lymph node metastasis prediction in clinical stage T1 lung adenocarcinoma: Lung Cancer, 2020; 145; 10-17

18. Li Z, Wang S, Yu H, A novel deep learning framework based mask-guided attention mechanism for distant metastasis prediction of lung cancer; 330-41

19. Tas HG, Tas MBH, Irgul B, Accurate diagnosis of COVID-19 from lung CT images using transfer learning: Eur Rev Med Pharmacol Sci, 2024; 28(3); 1213-26

20. Selvaraju RR, Cogswell M, Das A, Vedantam R, Grad-cam: Visual explanations from deep networks via gradient-based localization; 618-26

21. Dai Y, Wang G, Li KC, Conceptual alignment deep neural networks: J Intell Fuzzy Syst, 2018; 34(3); 1631-42

22. Li O, Liu H, Chen C, Rudin C, Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions

23. Chen C, Li O, Tao D, This looks like that: Deep learning for interpretable image recognition: Adv Neural Inf Process Syst, 2019; 32; 1-12

24. Wang A, Chen H, Liu L, Yolov10: Real-time end-to-end object detection: Adv Neural Inf Process Syst, 2024; 37; 107984-8011

25. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ, Densely connected convolutional networks; 4700-8

26. He K, Zhang X, Ren S, Sun J, Deep residual learning for image recognition; 770-78

27. Tan M, Le Q, Efficientnet: Rethinking model scaling for convolutional neural networks; 6105-14

28. Szegedy C, Liu W, Jia Y, Going deeper with convolutions

29. Redmon J, You only look once: Unified, real-time object detection

30. Zong Z, Song G, Liu Y, Detrs with collaborative hybrid assignments training

31. Chen Y, Chen Q, Hu Q, Cheng J, Date: Dual assignment for end-to-end fully convolutional object detection, 2022 arXiv preprint arXiv: 2211.13859

32. Ghiasi-Shirazi K, Generalizing the convolution operator in convolutional neural networks: Neural Process Lett, 2019; 50(3); 2627-46

33. Torrey L, Shavlik J, Transfer learning: Handbook of research on machine learning applications and trends: Algorithms, methods, and techniques, 2010; 242-64, IGI Global Scientific Publishing

34. Pan SJ, Transfer learning: Data classification: Algorithms and applications, 2020; 34, New York, CRC Press, Taylor & Francis Group

35. Weiss K, Khoshgoftaar TM, Wang DD, A survey of transfer learning: J Big Data, 2016; 3; 1-9

36. Zhao Z, Alzubaidi L, Zhang J, A comparison review of transfer learning and self-supervised learning: Definitions, applications, advantages and limitations: Expert Syst Appl, 2024; 242; 122807

37. Everingham M, Van Gool L, Williams CK, The pascal visual object classes (voc) challenge: Int J Comput Vis, 2010; 88; 303-38

38. Wang YW, Chen CJ, Huang HC, Dual energy CT image prediction on primary tumor of lung cancer for nodal metastasis using deep learning: Comput Med Imaging Graph, 2021; 91; 101935

39. Grossman R, Haim O, Abramov S, Differentiating small-cell lung cancer from non-small-cell lung cancer brain metastases based on MRI using efficientnet and transfer learning approach: Technol Cancer Res Treat, 2021; 20; 1-7

40. Guo Y, Lin Q, Wang Y, Integrating transfer learning and feature aggregation into self-defined convolutional neural network for automated detection of lung cancer bone metastasis: J Med Biol Eng, 2023; 43(1); 53-62

41. Papandrianos N, Papageorgiou E, Anagnostis A, Feleki A, A deep-learning approach for diagnosis of metastatic breast cancer in bones from whole-body scans: Appl Sci, 2020; 10(3); 997

42. Botlagunta M, Botlagunta MD, Myneni MB, Classification and diagnostic prediction of breast cancer metastasis on clinical data using machine learning algorithms: Sci Rep, 2023; 13(1); 485

43. Zhong X, Lin Y, Zhang W, Bi Q, Predicting diagnosis and survival of bone metastasis in breast cancer using machine learning: Sci Rep, 2023; 13(1); 18301
