Article information
2025, Volume 30, No. 4, p. 145-158
Smagin S.I., Oklandikov V.E., Kozhevnikova T.V., Zhivotova A.A.
Neural network models for identifying toughly translating sentences
Purpose. This study develops and evaluates neural network classification models for identifying sentences that are difficult for machine translation systems to translate. Methodology. We assemble and preprocess a dataset of sentences labeled by translation difficulty, apply tokenization, and implement multiple neural network architectures for their classification. Three models are built: a simple recurrent network (A1) using SimpleRNN layers, a long short-term memory (LSTM) network (A2), and a convolutional neural network (A3) with Conv1D layers. The models are trained and tested on the dataset using standard machine learning procedures, and their classification performance is evaluated using metrics such as accuracy and F1-score. Findings. The experimental results show that the LSTM-based architecture (A2) achieves the highest classification accuracy and F1-score among the proposed models, indicating that it is better able to capture complex features related to translation difficulty. All models yield satisfactory results; however, they differ clearly in training dynamics and final performance metrics. Detailed metric values for each architecture are reported, confirming the feasibility of using neural networks for this binary classification problem. Originality/value. A novel application of neural network classifiers to the problem of detecting translation-difficult sentences is presented. The developed dataset and models can improve pre-translation analysis and help optimize machine translation pipelines by flagging challenging inputs. The approach contributes to computational linguistics by exploring different neural architectures and offering a valuable resource for further study.
Keywords: neural network, machine learning, algorithm, machine translation, classification
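The abstract evaluates the classifiers by accuracy and F1-score on a binary task. As a minimal sketch of how those two metrics relate, the snippet below computes them from predicted and true labels. The function name `binary_metrics`, the convention that label 1 means "difficult to translate", and the sample labels are illustrative assumptions, not data or code from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for 0/1 labels.

    Assumption: label 1 marks a sentence as difficult to translate.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


# Made-up labels for eight sentences, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, prec, rec, f1 = binary_metrics(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
```

Reporting F1 alongside accuracy is useful when difficult sentences are rare: a model that never flags anything can still score high accuracy, but its F1 falls to zero.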
Author(s): Smagin Sergey Ivanovich Dr., Corresponding Member of RAS, Professor Position: Director Office: Computer Center FEB RAS Address: 680000, Russia, Khabarovsk
Phone Office: (4212) 22 72 67 E-mail: smagin@ccfebras.ru SPIN-code: 2419-4990
Oklandikov Vladimir Evgenievich Position: Engineer Office: Computational Center FEB RAS Address: 680000, Russia, Khabarovsk
Kozhevnikova Tatiana Vladimirovna Position: Head of department Office: Computational Center FEB RAS Address: 680000, Russia, Khabarovsk
Zhivotova Alyona Anatolyevna PhD, Associate Professor Office: Komsomolsk-na-Amure State University Address: 681013, Russia, Komsomolsk-On-Amur
Bibliography link: Smagin S.I., Oklandikov V.E., Kozhevnikova T.V., Zhivotova A.A. Neural network models for identifying toughly translating sentences // Computational technologies. 2025. V. 30. No. 4. P. 145-158