ANALYSIS OF FEASIBILITY OF INTEGRATED ASSESSMENT INSTRUMENTS TO MEASURE CRITICAL THINKING SKILLS AND SCIENTIFIC ATTITUDES OF HIGH SCHOOL STUDENTS ON ACID-BASE TITRATION MATERIALS

This research aims to develop an integrated assessment instrument and to determine its feasibility for measuring the critical thinking skills and scientific attitudes of senior high school students on acid-base titration material. The research used the 4D development model with define, design, develop and disseminate stages. The legibility of the product was assessed by 151 students and its feasibility by 7 chemistry teachers in Sleman Regency, Yogyakarta. The polytomous data generated by the instrument trial were analysed using the WINSTEPS 3.73 program and the Partial Credit Model 1-Parameter Logistic (PCM 1-PL) approach. The data collection techniques consisted of interviews, questionnaires and tests. The data collection instruments were an interview guide, an instrument user response questionnaire, an instrument legibility questionnaire, and the integrated assessment instrument in the form of description questions. The results showed that the student legibility questionnaire has a reliability of 0.80 and the instrument user response questionnaire has a reliability of 0.94. The teachers' response to the integrated assessment instrument was rated very good. Therefore, the integrated assessment instrument is feasible for measuring the critical thinking skills and scientific attitudes of senior high school students on acid-base titration material.

by the teacher. However, the questions written are still dominated by short-answer items. Limiting assessment to knowledge, understanding, and application means that students' abilities are used only to solve low-level cognitive problems (Sastrawati, Rusdi & Syamsurizal, 2011). Gardner (Murtono & Miskiyah, 2014) said that short-answer tests at the low cognitive level capture only a small part of students' skills and intelligence. At present, however, students are required to have not only low-level thinking skills but also higher-order thinking skills (Istiyono, Mardapi & Suparno, 2014). Mardapi, Kumaidi & Kartowagiran (2011) stated that many learning outcome instruments, both those used by teachers for daily tests and those used by schools for general tests, still do not meet the standard for a good test. Moreover, teachers still prioritize assessment of the knowledge aspect, while assessment of attitude is based on the teacher's subjective judgment. This is in line with Zoller (2001), who stated that the learning process so far has emphasized only the cognitive aspect. Cantos et al. (2015) proposed that assessment should reflect students' overall ability in knowledge, attitude and skill. This corresponds with Minister of Education and Culture Regulation No. 66 of 2013 on educational assessment standards, which states that assessment in learning should cover three aspects: knowledge, attitude and skill.
Based on this description, one alternative that teachers can use to measure students' critical thinking skills and scientific attitudes simultaneously is integrated assessment. Integrated assessment is a process of merging students' learning results from various topics into one efficient series of assessments (McPhun, 2010). An integrated assessment instrument can help teachers assess students' learning outcomes after instruction.

Method
This study is Research and Development (R&D), a research method used to produce a particular product and examine its effectiveness. The research model is procedural, i.e., a descriptive model showing the steps that should be followed to produce a product. The study used the 4-D (four-D) development model developed by Thiagarajan, Semmel, & Semmel (1974), which consists of four stages: define, design, develop and disseminate. This model was chosen because its sequence of activities is designed systematically for creating and developing learning products such as assessment instruments.
This research and development was conducted at a senior high school in Sleman Regency, Yogyakarta, Indonesia. The legibility test of the integrated assessment instrument involved 151 students of class XI MIPA. The test subjects for the legibility test were determined using the purposive sampling technique.
The type of data used is instrument feasibility data. Types of data collection instruments include interview guidelines, instrument legibility questionnaire sheets, instrument user response questionnaire sheets, and integrated assessment instruments in the form of description tests. The data analysis technique used is qualitative data analysis and quantitative data analysis.
The data were analysed using a descriptive quantitative approach and then converted into categories. The product was assessed by the chemistry teachers using a Likert scale, as presented in Table 1.
[Table 1. Likert scale scores; only the last two rows survive: Less = 2, Very less = 1.]
The final score was then converted into a product feasibility category, as presented in Table 2. After the chemistry teachers assessed the feasibility of the product through the user response questionnaire, the next step was to conduct a legibility test with students. The instrument legibility data were obtained from the students' responses, scored as shown in Table 3. The final score of the legibility test was then converted into a product feasibility category, as presented in Table 4.

Data Validity of Legibility Instrument Questionnaire
The validity of the instrument legibility questionnaire was assessed with a validation sheet completed by material expert lecturers and learning evaluation expert lecturers. The assessment was carried out by giving an accuracy score to each statement item in the questionnaire. The validity of the instrument legibility questionnaire was then calculated using equation 1 below.
$$\bar{x} = \frac{\sum x}{n} \qquad (1)$$
where $\bar{x}$ = the score of the legibility questionnaire, $\sum x$ = the total score, and $n$ = the number of questionnaire statement items.
After the average content validity from the two experts was calculated, the final validation score was converted into a feasibility category. Based on the analysis by the material experts and evaluation experts, the average content validity index of the instrument readability questionnaire was 3.87 out of a maximum score of 4.00, which falls in the very good category. The validation result of the instrument legibility questionnaire is presented in Table 5. Based on Table 5, the instrument legibility questionnaire is feasible to be used for field trials. The expert judgment also produced suggestions concerning the use of language, pictures, tables, graphs, materials and instrument displays. These suggestions were followed so that students can work on the questions well, without confusion or other disturbing factors. This is in line with the opinion of Irwanto et al. (2017) that attention must be paid to the instrument, especially to the language used, so that students do not have difficulty understanding the questions. This is important because language error is one of the factors that can affect test performance.
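The averaging step described above can be illustrated with a minimal sketch. It assumes that equation 1 is a simple mean of item scores (total score divided by the number of statement items) and that the two experts' means are then averaged with equal weight. The function names and the ratings below are hypothetical and are not data from this study.

```python
from statistics import mean


def mean_item_score(item_scores):
    """Mean score over questionnaire items: total score / number of items."""
    if not item_scores:
        raise ValueError("at least one item score is required")
    return sum(item_scores) / len(item_scores)


def average_across_experts(scores_by_expert):
    """Equal-weight average of each expert's mean item score (an assumption)."""
    return mean(mean_item_score(s) for s in scores_by_expert.values())


# Hypothetical ratings on a 1-4 scale; NOT data from the study.
ratings = {
    "material_expert": [4, 4, 3, 4],
    "evaluation_expert": [4, 3, 4, 4],
}
overall = average_across_experts(ratings)
```

The resulting average would then be mapped to a feasibility category using the conversion table referred to in the text, which is not reproduced in this sketch.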

Data of Validation Result of Instrument User Response Questionnaire
The validity of the instrument user response questionnaire was assessed with a validation sheet completed by material expert lecturers and learning evaluation expert lecturers. The assessment was carried out by giving an accuracy score to each statement item in the questionnaire. The validity of the instrument user response questionnaire was then calculated using equation 1.
After the average content validity from the two experts was calculated, the final validation score was converted into a feasibility category. Based on the analysis by the material experts and evaluation experts, the average content validity index of the instrument user response questionnaire was 3.88 out of a maximum score of 4.00, which falls in the very good category. The validation results of the instrument user response questionnaire are presented in Table 6. Based on Table 6, the instrument user response questionnaire is feasible to be used for field trials.

Data of Instrument Legibility Result
The feasibility of the instrument is also supported by the result of the legibility test questionnaire given to 151 students. The legibility questionnaire consisted of 15 questions and aimed to determine students' understanding of the language use, pictures, tables, graphs, materials, and instrument displays. The results of the instrument legibility test are presented in Table 7. Table 7 shows that most of the students responded "strongly agree" to all aspects measured in the instrument readability questionnaire, with the didactic, construction, and technical requirements each falling in the very good category. These results indicate that the language, pictures, tables, graphs, and material presented in the instrument are easy for students to understand and that the instrument is feasible to be tested in the field.

Product Assessment by Chemistry Teacher
The product assessment was carried out by seven chemistry teachers. The aim was to obtain suggestions and input from the chemistry teachers to improve the questions that had been developed. The instrument assessment by the reviewers covered five feasibility components, which were broken down into 24 criteria. The feasibility components are the substance, construction, language, validity, and practicality aspects, adapted from Minister of Education and Culture Regulation No. 66 of 2013 on educational assessment standards. The user response data for the integrated assessment instrument are presented in Table 8. Table 8 shows that most chemistry teachers gave very good responses to all aspects measured in the instrument user response questionnaire. These results indicate that the integrated assessment instrument is feasible for field trials.

Estimation of Instrument Reliability
Besides meeting the requirements of logical validity through expert judgment, every research instrument also needs to have empirical reliability. Azwar (2011) states that reliability refers to the extent to which measurement results are trustworthy, dependable, consistent and stable. Measurement results can be trusted if several measurements of the same group of subjects yield relatively similar results. The instruments in this research include the instrument user response questionnaire and the instrument readability questionnaire. The results of the reliability analysis using the Winsteps program are presented in Table 9, while the details are presented in Figure 1 and Figure 2. The reliability estimates in Table 9 show that the user response questionnaire and the instrument legibility questionnaire have sample reliabilities of 0.94 (very high) and 0.80 (high), respectively. Therefore, the instrument user response questionnaire and the instrument legibility questionnaire have met the requirements of empirical reliability.
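To make the idea of a Rasch-type reliability estimate concrete, the sketch below computes a separation reliability as the share of observed variance in the estimated measures that is not attributable to measurement error. This is one common definition and is offered only as an illustration; it is not claimed to reproduce the exact Winsteps computation used in this study. The function name, the choice of population variance, and the input values are assumptions for illustration, not results from the study.

```python
from statistics import pvariance


def separation_reliability(measures, standard_errors):
    """(observed variance - mean squared standard error) / observed variance."""
    if len(measures) != len(standard_errors) or len(measures) < 2:
        raise ValueError("need matching lists with at least two entries")
    observed_var = pvariance(measures)  # population variance (an assumption)
    if observed_var == 0:
        raise ValueError("measures have zero variance")
    error_var = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    true_var = max(observed_var - error_var, 0.0)  # clamp at zero
    return true_var / observed_var


# Hypothetical logit measures and standard errors; NOT data from the study.
example_measures = [-1.0, 0.0, 1.0, 2.0]
example_errors = [0.5, 0.5, 0.5, 0.5]
reliability = separation_reliability(example_measures, example_errors)
```

Under this definition, values closer to 1 indicate that most of the spread in the measures reflects real differences rather than measurement error, which matches the interpretation of 0.94 and 0.80 as very high and high.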

Conclusion
Based on the results of the data analysis and discussion, it can be concluded that the integrated assessment instrument was developed using the 4-D model. The students' legibility level, based on the didactic, construction and technical aspects, has an average score of 3.25, which falls in the very good category, and an estimated reliability of 0.80, which is classified as high. The chemistry teachers' response to the integrated assessment instrument is very good, with an average score of 4.25 based on the substance, construction, language, validity and practicality aspects. The estimated reliability of the instrument user response questionnaire is 0.94, which is classified as very high. Therefore, the integrated assessment instrument is appropriate for measuring senior high school students' critical thinking skills and scientific attitudes on acid-base titration material.